PREFACE


VOCATIONAL psychologists are frequently asked, “How good is the
Kuder Preference Record (or the Crawford Spatial Relations, or some
other test)?” The importance of such questions is brought out by the fact
that, in one year, 20,000,000 Americans took a total of 60,000,000 tests
(2fi). Testing is indeed a “big business.” It is the aim of this book to
provide the user of vocational tests with a detailed and objective answer to
questions such as this, for a number of the most widely used and useful
tests. This is done by bringing together the results of the significant
research which has been done with each of these tests, by interpreting these
findings in the light of recent developments in testing theory and practice,
and by viewing each test in the perspective gained by those who are
currently using them in schools, colleges, consultation services, business, and
industry.

But the objective of this book goes beyond that of providing a manual
of currently usable tests, important though that is. In bringing together
and interpreting the results of research with existing tests, an attempt is
made to familiarize the reader with the bibliographical sources and to
take him through the processes of collection of data and synthesis of
findings, so that he may develop the work habits and thought processes which
will enable him, as new research is published and as new tests are put on
the market, to evaluate instruments himself and to make new
applications. Insofar as this goal is accomplished, the user of vocational tests will
be enabled to keep abreast of progress in the field and to work on a high
professional plane.

In this process, the student should develop an understanding of the 
basic procedures of the development of vocational tests. It is true, of 
course, that most vocational counselors, psychometrists, and personnel
workers are and should be primarily users and interpreters rather than
constructors of tests. It is rare that real skill as test technician and as
counselor are combined in one person. But, to be an intelligent consumer,
one must be familiar with the procedures and problems involved in the
development or manufacture of the product which is to be used. This 
does not necessitate skill in manufacture, but it does require detailed 
knowledge of methods, materials and problems. As each test is studied, the 
methods used in constructing, standardizing, and validating it will there¬ 
fore be described in some detail. The underlying assumptions will be 
pointed out, and the validity of the criteria used will be considered. Such 
knowledge is important in personnel selection, in which custom-built 
tests generally prove most effective, and in vocational counseling, in which 
generalizations are made on the basis of limited data. 

As a result, the reader should become well acquainted with the demon¬ 
strated values and limitations of the most widely used vocational tests. 
The word demonstrated should be emphasized, for during the past twenty 
or twenty-five years, and especially during the past decade, a great deal of 
research has been carried on and published on the validity of vocational 
tests. There is no longer any excuse for depending primarily on hunches
as to the vocational significance of special aptitude tests, nor for going to 
the other extreme and concluding that, since “a test tests only what it 
tests,” one can conclude nothing from psychological test results
concerning vocational promise. Both of these attitudes and practices were
widespread during the icon’s, when validity data were sketchy and often
disappointing. For example, the O’Connor Tweezer Dexterity Test was
frequently used as one indicator for dental training, on the basis of the test
author’s unsupported statement that it should be valid for dentistry, and
on the basis of logical analysis. Some counselors and personnel workers,
however, impressed by the lack of expected validity in some of the tests for
which criterion data had been obtained, refused to concede any predictive
value to tests, maintaining that aptitudes are too highly specific for
performance on one laboratory task to predict performance in a real life
situation. Enough data have now been accumulated so that a more
realistic and pragmatic approach is possible: the counselor can know,
from experimental evidence, a good deal about the nature of the trait
being measured and about its role in vocational adjustment. His
interpretations of test results can therefore be based on objective evidence or,
when the evidence does not go far enough, on logical analysis which uses 
fact rather than fancy as a starting point. 

It is not meant to imply, however, that we now know all we need to 
know about aptitudes and interests, nor about the instruments which we 
use to measure them. On the contrary, there are still many gaps in our 
knowledge, some of them surprising indeed after a generation of creative
work. For example, such a simple question as that of the maturation of 
clerical aptitude as measured by the Minnesota Vocational Test for 
Clerical Workers (speed and accuracy of name and number discrimina¬ 
tion) has not been answered, despite some beginnings; or, putting it in 
practical rather than theoretical terms, we do not yet know at what age 
it is legitimate to use adult norms for the Minnesota Clerical Test, and 
at what ages comparison should be made only with boys or girls of the 
same age level. The question of the relationship between two and three- 
dimensional spatial visualization has not yet been finally answered, funda¬ 
mental though it is to the use of the Minnesota Spatial Relations and 
Paper Form Board Tests in shop work as opposed to drafting. Even apart 
from somewhat theoretical questions there is still much to be done. The
norms for one of the most valuable group tests of intelligence, the Ameri¬ 
can Council on Education Psychological Examination, for example, are 
still entirely based on college freshmen; research has shown that scores
increase with age in college (see p. 115); but we have practically nothing
concerning the occupational significance of A.C.E. scores at any age, 
something that it would seem both logical and important to have for use 
in counseling college students. This point is dwelt upon briefly, partly in 
order to stress the fact that, although we know a great deal about the 
significance of many tests, there are still great gaps in our knowledge, and 
partly in the hope that the pointing out of some of these gaps will result 
in further research along lines which will round out our knowledge. 

One of the principal weaknesses in the measurement movement has 
been the excessive individualism of the research which has been carried 
on. Individualism has been good in that it has encouraged branching out 
in new directions and trying out new possibilities, but it has been bad in 
that it has resulted in the scattering of efforts and in the frequent drop¬ 
ping of a good idea after it has been barely tried. For every research 
project comparable to Strong’s persistent study and refinement of his 
Vocational Interest Blank throughout the past twenty years, there are
several like Zyve’s Scientific Aptitude Test and Bernreuter’s Personality
Inventory, whose initial promise has never been adequately explored.
This is partly because the test authors, often for excellent reasons, did not 
follow up their initial work, partly because the research carried on by 
other people with these instruments has generally been unco-ordinated 
and incidental. 

For test development work to be fully effective, two things are needed 
in addition to those which have characterized it so far. One of these is the
periodic and systematic review of work with specific tests or types of tests. 
This should be more detailed, critical and creative than the periodic re¬ 
views published in the Review of Educational Research by the American 
Educational Research Association; it should be more regular and more 
co-ordinated than the excellent reviews which occasionally appear in the 
Psychological Bulletin and Psychological Review as a result of the activi¬ 
ties of individual psychologists; and it should be more integrated and 
pointed toward action than Buros’ Mental Measurement Yearbooks.

It is hoped that this book will serve this purpose, pointing out
important research that needs to be done to round out our knowledge of
vocational tests and stimulating psychologists, vocational counselors and
personnel workers to carry out appropriate research projects. There
should in time be a committee of the American Psychological
Association, the National Vocational Guidance Association, and the American
Management Association whose function it is to plan and co-ordinate
such critical and constructive reviews. The second major need in the
development of vocational testing is an extension of this function from
systematic review and suggestion to systematic planning and execution of
research. Such a committee should take the initiative in encouraging
research along needed lines, partly by publications and talks at professional
meetings, and partly by a program of grants-in-aid of suitable research.
Assistance should even be provided in planning and financing major
research projects for the large-scale study of a number of important and
related problems. The Minnesota Mechanical Abilities Project of the 1920’s,
the Minnesota Employment Stabilization Research Institute of the 1930’s,
Strong’s work in vocational interests, Thurstone’s work on primary
mental abilities, Kuder’s work on primary interests, the United States
Employment Service’s work on the development of basic occupational test
batteries should be multiplied and, in some cases, expedited as they could
be only in a nationally sponsored and co-ordinated plan.

A few words should be said about the selection of tests discussed in this 
book. No attempt is made to cover all tests, or even all tests of some 
value. Annotated catalogues of tests are available from publishers and
distributors such as the Psychological Corporation, Science Research
Associates, World Book Company, and California Test Bureau. Too many
treatises of testing are little more than annotated catalogues. Instead, a
number of tests have been selected for detailed consideration because they
measure aptitudes or traits of demonstrated importance, are typical of
others designed to measure the same characteristics, are as readily
administered and scored as others of their type, and, particularly, have been
sufficiently studied so that something is known both concerning the
nature of the characteristics measured and the validity and usefulness of
the measuring instrument. In a few instances this last, most fundamental,
consideration has been departed from in order to permit brief discussion
of what appears to be a promising technique deserving of more extensive
and thorough study. In addition, briefer mention is made of certain other
tests which merit discussion because they are widely known even though
of little or no vocational value. Discussion of some tests seems forced
upon one by the extent of their use in industry, even though neither their
proved nor probable value to the counselor or personnel man justifies
giving them space. Similarly, the use of Wechsler-Bellevue part-scores as
indices of special aptitudes by many clinical psychologists dealing with
problems of vocational adjustment makes it necessary to consider that
topic, even though there is as yet little occupational evidence to justify
such a practice. The tests discussed include all but six of the .jo tests
listed by Berkshire et al. (Syp as most commonly used in guidance centers,
plus several less widely used but otherwise important instruments. These
authors state that some 20 of the tests surveyed appear to be “basic to the
guidance function.” Similarly, the great majority of tests found, in a
confidential survey of industrial testing, to be widely used are included in
this treatise.

Apart from the annotated catalogue approach which has characterized
a number of books on testing, several other approaches are possible. One
of these is the introductory survey of measurement theory and practice.
E. B. Greene’s Measurements of Human Behavior (yocp is one of the most
widely used examples. This present book differs from such texts in that
it assumes a knowledge of the fundamentals of measurement (of which a
review is provided in Appendix A for those who need it), and in that it
deals with the problems, methods, and results of vocational testing in an
intensive and comprehensive manner. It is designed to serve both as a
handbook for counselors, psychometrists, and personnel workers actually
using tests in practice, and as a text for courses in the use of tests in
counseling and selection.

Another approach in a book or course on testing is to teach the
techniques of test construction and validation. Clark Hull's Aptitude Testing
(385), a classic in this field for more than a decade after its publication in
the mid-twenties, illustrates this emphasis. Adkins' Construction and
Analysis of Achievement Tests (7) is a more recent manual, written for
personnel selection. Thorndike’s Personnel Selection (833a) is another.
There is a need for a text of this type, for use in courses on test
construction, but this book does not attempt to meet both needs.

Still another approach is that embodied in Walter V. Bingham’s Apti¬ 
tudes and Aptitude Testing (9^), published under the aegis of the Na¬ 
tional Occupational Conference in 1937, for a decade the standard text 
in courses on testing in vocational guidance, and now undergoing re¬ 
vision. In his book, Bingham focuses attention on the constellations of 
abilities that play a part in success in the major occupational fields. This 
occupational orientation is important, but in stressing it, something more 
important to the user of tests in actually understanding a person who has 
been tested is neglected. This is the consideration of the question, “what 
does this test, and the score made on it by this person, tell me about his
vocational promise?”

It is around this question that the author has attempted to organize 
this book. Experience as counselor, personnel consultant, supervisor and 
instructor has shown that the user of vocational tests in diagnostic work 
starts with data about the client, which he then synthesizes and interprets
in terms of vocations. It is true that he needs to make a decision as to 
what vocational goals are likely to be considered in order to select
appropriate tests, and that this requires thinking in terms of occupations
and constellations of abilities. However, test batteries for occupational 
families are not yet developed to a sufficient degree to make this the best
approach in actually interpreting test results and counseling. Instead, the 
psychologist or counselor must tease what meaning, suggestions, and 
contra-indications he can from the test and other personal data on hand. 
In some of the most effective vocational counseling and personnel
evaluation services vocational tests are used, not only for the occupational norms
which permit comparison with successful workers, but also for the analysis
of the psychological strengths and weaknesses of the client, which are then
interpreted in terms of possible vocational opportunities. This latter type 
of analysis requires thorough knowledge of the tests used, supplemented 
by detailed knowledge of occupations from first-hand experience and 
from psychological research. This book therefore considers the topic 
stressed by Bingham, but emphasizes that which he played down, in the 
belief that this is more helpful to the user of tests. Another unique feature 
in a text such as this is the material on the use of test results in counseling, 
that is, on putting test results to work. 
It is the writer’s belief that this book should be of special value to
vocational psychologists, personnel workers, and counselors in another way.
Great progress has been made in testing for vocational selection and 
guidance (the two go hand-in-hand) during the past ten years. Much of 
this work has been published in the journals and monographs; much is 
still in the files of the military services; some is simply part of the folk¬ 
lore of vocational testing and counseling, known to some of those en¬ 
gaged in such work. The writer hopes that, in drawing from intimate 
knowledge of these several sources, he has been able to make the most im¬ 
portant of these advances available to users of psychological tests in voca¬ 
tional guidance and selection. If the work of the Aviation Psychology 
Program of the Army Air Forces has been drawn on more extensively 
than any other single source, it is because the comprehensiveness and 
thoroughness of that program made it a unique source of materials on 
personnel testing. 

In using this book as a text in a graduate course in vocational testing, 
the author uses four other instructional aids which may be of interest to 
other instructors. Although they have been developed to supplement the 
book, they are independent of it as it is of them. One of these aids is a 
Sourcebook for Vocational Testing (Teachers College Bureau of
Publications), containing photo-offset reproductions of a number of the more
significant original articles on the tests dealt with in this book; it is used
to facilitate access to journal material and to train students to use reports
of original research in evaluating and understanding tests. The second is 
a Kit of Vocational Tests assembled by the Psychological Corporation 
and the College Bookstore; it contains manuals, scoring keys, and test 
blanks for all paper and pencil tests studied intensively in the course (the 
major tests treated in this book). Students thus have easy access to manuals 
and keys, and start their own test libraries. The third aid consists of
copies of catalogues of selected test publishers, giving complete data on 
ordering and costs; this makes it unnecessary to include such transitory 
data in the text. The fourth aid consists of a well-equipped testing labora¬ 
tory, in which supervised practice in testing is given. 

It is with mixed feelings that the author parts with this manuscript. 
Based as it is on the findings of research in a rapidly developing field, it 
is inevitable that even before it comes off the press some of the questions 
which have been mentioned as unanswered will have been answered by
new investigations. Some of the conclusions may soon need modification.
The indulgence of the reader is therefore requested when he finds that
the facts on which a generalization is based have changed. The material
in this book should nevertheless be of vital importance, if only as a
background against which to see the findings of new studies as they appear. It
is the writer’s intention to revise it periodically, as new tests and new
findings require. He will therefore welcome the co-operation of authors of
research studies in sending him reprints of papers bearing on the
subject of this book. By cutting down the time consumed in bibliographical
research, this will make it easier to survey relevant material, improve the
coverage of the book, and speed up the preparation of revisions.

The acknowledgments due to others in connection with the preparation
of this book are numerous, varied, and a source of such pleasure that I
have looked forward to the writing of these paragraphs. 

First, there are those from whose work I have learned much of what I
know about testing: Professor Donald G. Paterson, Dean Edmund G.
Williamson, and Dr. John G. Darley, of the University of Minnesota, the
first-named an unseen friend whose correspondence over a period of
several years has added to the professional stimulation provided by the
publications of the Minnesota researchers; Dr. Edward K. Strong, Jr., of
Stanford University, whose work in the measurement of interest first aroused
my interest in measurement; Drs. Laurance F. Shaffer, Neal E. Miller, and
Robert R. Blake, at one time officially and respectively my chief,
associate, and assistant in the Aviation Psychology Program of the Army
Air Forces, but actually my helpful and stimulating colleagues in a
number of research projects; Dr. John C. Flanagan, now of the American
Institute for Research, formerly director of the Aviation Psychology
Program of the Army Air Forces, whose vision and singleness of purpose
made that program both a landmark in the field of psychometrics and a
most worth while professional experience for those involved in it; and
Dr. Harry D. Kitson, my senior colleague, whose interest in improving
the understanding of vocational tests by their consumers has been a
constant encouragement in the preparation of this book.

Secondly, there are those who have contributed to the actual writing of 
the book by their careful reading and criticism of parts of the manuscript. 
Dr. Kitson read the first draft in its entirety, applying his skill and per¬ 
spective as editor of Occupations to the broader problems of organization, 
presentation, and interpretation. Professor Paterson read selected chapters 
for which his experience in test construction and perspective as editor of
the Journal of Applied Psychology made him a valued critic. Dr. Shaffer
also found time in his busy schedule as Chairman of the Department of
Guidance, Teachers College, Columbia University, and editor of the
Journal of Consulting Psychology, to read the introductory chapters, the
chapters on tests of intelligence and of personality, and those on the use
of test results, with unusual care and discernment. Mr. Bruce Shear,
Director of Pupil Personnel Services for Northern Westchester County, has
made practical suggestions concerning certain chapters. Charles N.
Morris, my junior colleague; Stewart Murray, Director of Guidance for Nova
Scotia; Vernon Wallace, Counselor at Brooklyn College; Davis Johnson,
Counselor in the Vocational Counseling Service of New Haven; Joseph B.
Shaw, Psychologist in the Jewish Vocational Service of Detroit; and David
Lane, Associate Director of the Veterans’ Guidance Service, Clark
University, read parts of the manuscript as graduate students, checking many
details, pointing out professorial obscurities, and encouraging me with
their constant interest.

Thirdly, there are the authors and publishers who have graciously made
possible quotation from their works, particularly the American Book
Co., the American Psychological Association, Henry Holt and Co., the
Houghton Mifflin Co., Dr. G. Frederic Kuder, the McGraw-Hill Book Co.,
Occupations, the Psychological Corporation, the Science Research
Associates, the Social Science Research Council, and the Stanford University
Press. In addition, Dr. Harold G. Seashore of the Psychological
Corporation and Mr. John R. Yale of the Science Research Associates cooperated
in supplying data and checking facts concerning certain tests.

The final word has been saved for the women and the children. Miss
Esther Grossmark, my secretary, has with patience and persistence
supervised part-time typists and sandwiched the typing of parts of the
manuscript into a heavy workload, strengthened, no doubt, by the special
interest of a student of psychology. And my wife and sons have cheerfully
spent innumerable weekends and evenings in other parts of the house
and garden while the typewriter hammered away in the study, breaking
the monotony occasionally with a pleasant word or an excited account of
some neighborhood event.

Donald E. Super

Montclair, N.J. 

February, 1949
TESTING AND DIAGNOSIS IN VOCATIONAL
COUNSELING 

The Nature and Purposes of Vocational Guidance and Counseling 

VOCATIONAL counseling has two fundamental purposes: to help 
people make good vocational adjustments and to facilitate the smooth 
functioning of the social economy through the effective use of manpower. 

These purposes imply that each individual has certain abilities,
interests, personality traits, and other characteristics which, if he knows
what they are and how they may be turned into assets, will make him a
happier man, a more effective worker, and a more useful citizen. Part of
his education, that is, literally, “leading him out” or guiding his
development and unfolding, therefore consists of helping him to get a better
understanding of his aptitudes for acquiring various skills, his
adaptability to differing types of situations, and his interest in the numerous
activities in which he might engage. Although less generally recognized
as such, this self-understanding is just as much an objective of education
as is the development of an understanding of the world in which he lives.
A well-educated man is one who has achieved both types of
understanding; a well-adjusted man is one who has been able to put these two types
of knowledge to good use and has found a place for himself in society.

Some educational programs have assumed that the processes of mental
discipline, intellectual development, and general education would result
in the desired self-understanding. However legitimate this assumption
might be in an effective educational program, the result is not achieved
in practice: the Regents Inquiry into the Character and Cost of Public
Education in New York State, as reported in the monographs by Eckert
and Marshall (inj.j) and by Spaulding (729), made it clear that a large
proportion of the products of our more or less traditional school systems
have neither the self-understanding nor the understanding of the world
around them that is necessary for good vocational adjustment or
citizenship. This lack of self-insight and of social understanding has been
revealed by numerous other studies of the relationship of the vocational
aspirations of youth to their abilities and to the opportunities open to
them (793: Ch. 2).

This being the case, vocational guidance is needed, to focus attention 
on the information about self and occupations that is needed for good 
vocational adjustment and to guide the development of a genuine under¬ 
standing and acceptance of these facts. Vocational guidance is, therefore, 
a dual process of helping the individual to understand and accept him¬ 
self, and of helping him to understand and adjust to society; it is both 
psychological and socio-economic. 

What are the psychological processes necessary to bring about the 
understanding which experience alone so often fails to produce? They 
are, of course, those of vocational counseling. And what is vocational 
counseling? It is the process of helping the individual to ascertain, 
accept, understand, and apply the relevant facts about himself to the 
pertinent facts about the occupational world which are ascertained 
through incidental and planned exploratory activities. The techniques 
of vocational counseling vary from case to case and from counselor to 
counselor, depending partly upon the counselee’s state of readiness and
partly upon the time available to the counselor, the degree of skill he 
has attained, and his philosophy of counseling. In many cases these 
techniques fall naturally into two categories: those of diagnosis and 
those of treatment or counseling in the more limited sense. There is, 
however, one important school of thought in guidance which is some¬ 
times described as opposed to the use of diagnostic activities, at least 
of the traditional varieties and in the traditional ways. This point of 
view has been most ably and widely propounded by Carl Rogers and 
his students (639,640,641) and is known as nondirective counseling.
Before embarking upon a discussion of the techniques of diagnosis prefatory
to the intensive study of diagnosis through tests, some consideration 
should be given to this question of the role of diagnosis in vocational 
and educational counseling. 

To Diagnose or Not to Diagnose? 

Nondirective counseling is based on the assumption that the individual 
has, within himself, the resources necessary to the solution of his own 
problems. All that he needs, according to this theory, is a permissive 
situation, one in which he can release his energies and bring these re¬ 
sources into play. It is the counselor’s role to create this permissive 
situation and to release these energies. He does this by creating a warm
and understanding atmosphere, by accepting and reflecting the feelings 
of the client, and thus making it possible for the client to work out his 
problem in his own way. 

Nondirective counseling originated in the treatment of behavior prob¬ 
lems by child guidance workers such as Jessie Taft working under the 
leadership of Otto Rank, and was referred to by them as passive, as con¬ 
trasted with active, relationship therapy. It was further developed by 
Rogers in working with the more normal personality problems of
adolescents and adults as well as with children’s behavior problems
(638,639,641); he and his students did a great deal to clarify the
principles, systematize the procedures, and broaden the applications of passive
relationship therapy in research and in teaching as well as in clinical 
work; in the process it was renamed nondirective therapy. Having 
demonstrated the values of nondirective counseling in dealing with 
certain types of personal adjustment problems, Rogers and some of his 
students have moved on to consider its application to problems of vo¬ 
cational and educational counseling (166,173,640). 

Having worked primarily in clinics and with the mild and moderate 
neurotics who turn to psychological clinics for help in quite dispropor¬ 
tionate numbers, Rogers has been impressed by the number of presumed 
problems of vocational adjustment which turn out to be problems of
personality adjustment: “For the nondirective counselor, vocational and
educational difficulties are personal problems.” “Following the
viewpoint of this manual will usually demonstrate that the statement of a
vocational or educational problem really disguises a deeper personal 
problem that must be handled before any real progress can be made on 
the manifest difficulty” (641:90 and 104). 1 If Rogers and his students had 
worked in more normal situations, with a more typical sample of adoles¬ 
cents and young adults, they would have found that a larger percentage 
need vocational guidance but have no significant personality problems 
and are ready for the “progress ... on the manifest difficulty,” for 
which, as Rogers states, neurotic clients are ready only after
psychotherapy. The average high school pupil and college student does not
need this (322). Indeed, one of Rogers’ students who works in a university 
guidance center reports that nondirective counseling seems appropriate 
in about twenty percent of the cases seen in that center (Arthur Combs,
in an address at the 19jh Regional Conference of the Council of
Guidance and Personnel Associations, Hotel Pennsylvania, New York). More
research needs to be carried out on this question before a definite
conclusion can be drawn, but the evidence so far suggests that what Rogers
has demonstrated with clinic cases cannot be applied without
modification to school, college, and normal adult cases.

1 By permission from Counseling with Returned Servicemen, by C. R. Rogers and
J. L. Wallen. Copyrighted 1945, Houghton-Mifflin Co.

This being so, Rogers’ injunctions against diagnosis (e.g., 641:5-6)
cannot be lifted from his discussions of psychotherapy and applied to
vocational counseling. This is not the place to dwell upon the adequacy of
Rogers’ views on the wisdom of avoiding the diagnosis of personality
problems (see Patterson, 594), although it might be pointed out in passing
that he does advocate some diagnosis when he writes (641:104): “The
meaning of the personal relationship must be assessed [italics mine]. What
use is the client attempting to make of his relationship with the
counselor?” More important here is the fact that he states, in discussing
a case (641:94), “True, the information was important in helping him to
evaluate himself more realistically than he had previously, but only
because the counselor allowed him to work through his attitudes and
feelings about the situation in the light of the new information.” In
other words, diagnosis skillfully done, at the right stage, and integrated
with the counseling, is often desirable.

As the writer sees it, Rogers’ sketchily expressed and scattered views
on diagnosis in vocational counseling amount to this: many cases which
seem to be problems of vocational and educational counseling are in
reality personality problems, and therefore it is wise to use nondirective
techniques at least in the first contact in order to establish the nature
of the real problem; if or when the real problem is vocational or
educational, the diagnostic use of tests may provide needed and valuable
information concerning the client which he will want to take into account
in making his plans; when such information is obtained and used, its
emotional significance to the client needs to be worked out by
nondirective methods, especially if the client is also working through
problems of personality adjustment.

Bragdon (116:81) and Fisher and Hanna (257) have reported in early
studies, and the writer has pointed out in his text on vocational guidance
(793:205,207,215), that many problems which appear to be vocational
and educational are in reality personal; this has been a widely accepted
fact among vocational counselors. The evaluation of the client’s reaction
to diagnostic data shared with him, combined with discussion (sometimes
nondirective and sometimes rather directive in nature) designed to help
him understand and accept the facts, is similarly an old and widely-used
technique of vocational counseling, as one who has observed many
mature and experienced vocational counselors at work can testify. Rogers’
contribution seems to have been to stress these facts in a way which has
brought them to the attention of other counselors who have been more
directive in their approach and who have tended to emphasize their own
diagnostic activities at the expense of the client’s understanding.

If Rogers’ views are contrasted with those expressed by Williamson in
his book on counseling (928:133-142) the error into which those who rely
too much upon tests, or are primarily interested in problems of diagnosis,
too easily fall will become clear. The type of counseling outlined therein
is quite directive; as Darley expresses the same point of view in another
book (190:1 bej), “the interview seems somewhat similar to a sales
situation, since the counselor attempts to sell the student certain ideas
about himself, certain plans of action, or certain desirable changes in
attitudes.”² The assumption is that since the counselor obtains the
significant information by technical methods and is better qualified to
understand their significance than the counselee, he should seek to convey
the information to the client by rational means and to get him to adopt an
appropriate plan of action. To quote Williamson (928:136): “Ordinarily
the counselor states his point of view with definiteness, attempting
through exposition to enlighten the student.”³ Williamson’s fallacy, like
that of many who have been concerned more with the development of
diagnostic techniques than with the development of individuals, seems
to have been to expect the counselee to gain insight by the same rational
processes used by the counselor in making a diagnosis.⁴ As many other
counselors have long known, and as Rogers has very effectively reminded
us, the insight-gaining processes of the counselee are affective and not
cognitive; they are emotional rather than rational. When objective
evidence is shared with the client his subjective reactions to it need to
be aired and examined in a way peculiarly suited to nondirective
interviewing. If this type of diagnostic activity is carried out well,
progress in vocational adjustment will be facilitated.

² By permission from Testing and Counseling in the High School Guidance
Program, by J. G. Darley. Copyrighted 1943, Science Research Associates.

³ By permission from How to Counsel Students, by E. G. Williamson.
Copyrighted 1939, McGraw-Hill Book Co.

⁴ The reader may wish to refer to the original context, as Darley has
indicated the belief that such quotations do not adequately represent
this view. See J. Appl. Psychol., 1944, 28, 179-180.

Data Needed in Vocational Diagnosis 

In order to evaluate a person’s vocational prospects, two types of
information about him are needed: the psychological facts which describe
his aptitudes, skills, interests, and personality traits, and the social
facts which describe the environment in which he lives, the influences
which are affecting him, and the resources which he has at his disposal.
To depend upon one type of fact to the neglect of the other is to be
unrealistic and to disregard important elements in vocational adjustment,
for the opportunities available to persons with similar aptitudes and
interests may vary greatly, just as the abilities and traits of people in
similar social situations differ from one person to the next. It has, for
instance, been demonstrated that many young men and women capable
of benefitting from a college education do not attend college because of
financial handicaps (234), just as many students who can afford to attend
college drop out because of learning difficulties.

The fact that many psychological characteristics are best judged by
means of tests which require special study and have the appearance of
objectivity and concreteness has often led to the relative neglect of social
factors in counseling by those trained to use tests, and to the neglect of
important psychological factors by those not trained to use tests. For
these reasons it seems desirable, in considering the types of data needed
in vocational diagnosis, to stress the need to obtain both types of
information and to use both testing and non-testing techniques. More will
be said later about the methods of gathering data; first let us focus on
the types of data needed.

Psychological data needed include information concerning the general
intelligence of the individual, that is, his ability to comprehend and
use symbols or to do abstract thinking. This academic aptitude is
important not only in school situations, but also in everyday life
situations in which ability to analyze a situation or a problem, to draw
conclusions, to generalize, and to plan accordingly, is needed. Special
aptitudes must also be explored. The work of recent years has shown that
what has been thought of as general intelligence is, in reality, a
combination of special aptitudes such as verbal comprehension, arithmetic
reasoning, and spatial ability (281). For this reason data concerning
strength or weakness in any one of these special areas must be obtained.
Other special aptitudes which play a part in clerical, technical, musical,
artistic, and manual activities must be known. The subject’s interests,
attitudes, and personality traits need to be assessed, in terms of their
vocational implications. And finally, data are needed as to the degree of
proficiency which he has attained in using any of the skills which he has
acquired.

Social data are needed in order to provide a framework in which to
interpret the psychological data. The occupational level of the parents
plays an important part, for example, in determining the vocational
ambitions of a youth and in his drive to achieve them, as well as in fixing
the financial resources upon which he can draw in furthering his
ambitions. The vocational achievements of the subject’s brothers and
sisters may be indicative of his own probable level of achievement, but
this prognosis is modified, in turn, by the age of the parents and their
financial independence. It not infrequently happens that the youngest
child fails to reach an occupational level as high as that of his siblings
because of the need to contribute to his parents’ support just at the time
at which he might have been going to college. The industrial and
cultural resources of the home and of the community, the educational
experiences of the individual, his leisure-time activities, and his
vocational experiences all need to be examined, in order that the resources
open to him and the use he has made of them may be understood. To
draw the line between psychological and social data is obviously
impossible at times, for in finding out what influences have been at work
on a person one also ascertains the ways in which he has reacted to them.

Techniques of Gathering Data 

With the improvement of testing techniques it has become possible
to measure an increasing number and variety of important psychological
characteristics. In 1918 intelligence was the only psychological
characteristic of vocational significance which could be effectively
measured; in 1928 manual, mechanical, artistic, musical and spatial
aptitudes, and vocational interest, could be added to the list, although
the measures of these characteristics were then quite new and therefore
relatively little understood. By 1938 a considerable amount of information
had been gathered about and by means of these instruments, they had been
refined and improved, and attitudes and clerical aptitude had been added
to the list of measurable entities. In 1948, after the lapse of another
decade, further improvements have been made in existing types of
instruments, much more is known about them, and measures of personality
have been developed to a point at which they appear to have clinical
validity even though their vocational significance is not clear.

Despite the great progress in psychological testing since World War I,
the variety of characteristics which can be measured still leaves a
great deal to be desired. As is made clear in greater detail in subsequent
chapters, the measuring instruments we now use even for the most
adequately measured traits such as intelligence and vocational interest are
still crude and only half-understood; those we use for measuring
personality traits such as general adjustment, introversion and the need
for recognition are still in embryonic stages; and there are no methods of
testing creative imagination, persistence, and certain other traits and
abilities which are often assumed to be important and which laboratory
studies and other types of investigations have suggested may actually
exist.

For these reasons the psychological study of a person’s abilities and
personality traits requires more than testing techniques. When a suitable
test is available, its use will generally save time and obtain the
information in a more objective, valid, and usable form than would
otherwise be the case. This is especially true of intelligence, and it
applies also to a variety of other traits. But some tests measure aspects
of ability or interest which are so narrow as to make their use
dangerously misleading unless the data obtained with them are thought of
as being only one small part of the aptitude picture; for example, the
existing tests of musical talent do not measure anything as broad as that
term implies, but only certain minute aspects of musical aptitude. They
need to be supplemented by observation of musical performance, ratings by
musicians, history of interest in musical activities, etc. As the major
part of this book is devoted to the uses of vocational tests, it is the
purpose of this section to point out some things that tests cannot now do
rather than to show ways in which they are useful. It aims to indicate
briefly the non-testing techniques which must be used in order to obtain a
well-rounded picture of a subject, rather than to discuss the useful
testing techniques.

The interview is the most widely used subjective method of gathering
personal data, as well as the principal treatment or counseling technique.
In diagnosis as in counseling, there are traditionally two divergent points
of view concerning interviewing. In one approach, the emphasis is on
careful planning, in having a well-thought-out interview schedule or
form which is to be completed during the interview. The interviewer
asks direct questions, using the phraseology of his schedule and adhering
to the order in which the questions appear on the schedule. In the other,
the nondirective approach, the interviewer merely sets the topic
(“structures the situation”), then accepts and reflects feeling in order
to let the person being interviewed lead the discussion into the areas
which are most important to him. Although the interviewer may not gather
data on exactly the topics which he had considered important, he does
obtain material on the problems which are of most importance to the client,
and therefore most important for diagnosis. The Hawthorne Study well
illustrates the development of this technique (657: Ch. 13). A commonly
used procedure is the patterned or semi-structured interview, in which
the interviewer uses the schedule only as a guide. In this semidirective
type of diagnostic interviewing, the essence of the technique is to use key
questions as a means of getting the person being interviewed to talk
freely on important topics, in the anticipation that desired facts will be
brought up in a context which makes their interpretation more complete
than it would be if the facts were given briefly and in response to a
direct question. In either type of data-gathering interview, and especially
in the less directive type, it is possible to obtain information not only
on factual items such as those normally covered in the social history, but
also on attitudes, ambitions, and other affective matters which
constitute the psychological case history (see <jh, and 768: Ch. 3 and 4,
for detailed discussions).

Questionnaires are frequently used in order to obtain data such as are
commonly gathered in the interview. The writer has demonstrated that
with literate subjects who want to co-operate this is an effective
time-saver in collecting factual material (804), but it is much less useful
than the interview as a means of gaining insight into the attitudes and
feelings of any but the most frank and insightful of individuals. Research
by Landis (p, 1) and others has shown that factual items are generally
reported with considerable accuracy when the subject has come for
counseling, although there is evidence (hyi) that others, whether subjected
to diagnosis against their will or under scrutiny as applicants for
positions, yield to the pressure to falsify facts and improve appearances
as much as they consider possible. Useful material on attitudes can
sometimes be gathered by questionnaire methods, often by transforming the
questionnaire into an attitude scale, but Spencer (733) has shown that the
truthfulness of material obtained depends on the anonymity of the
response and, by inference, on the confidence of the respondent in the
person using the data. Symonds (810: Ch. 4) has discussed the details of
questionnaire construction at some length, pointing out steps which can
be taken to improve the understanding of the questions by the various
people filling out the form and thus to compensate as much as possible
for the lack of flexibility inherent in the technique. If the questionnaire
is well constructed and good rapport is established in its use, remarkably
frank answers can be obtained concerning matters which the respondent
is able to put into words, as shown in a study made under conditions of
anonymity by Shaffer (710) and in another involving signed
questionnaires by Kemble (.jso).

Rating scales are a third widely used non-testing technique of
gathering diagnostic data, although they resemble tests in that they
attempt to quantify evidence and to be objective. A great deal of research,
summarized by Symonds (810: Ch. 3) and from the counselor’s point of view
by Strang (768: Ch. 6), has demonstrated that despite its objective
appearance the rating scale is a very subjective technique, being
fundamentally the recording of opinion. Despite this defect, rating scales
have been found useful in personnel selection (115) and evaluation
(538:193-197); but judging by the accumulated experience of those who have
tried them, they have not proven very helpful to counselors interested in
getting a picture of the characteristics of students or others with whom
they are working.

Anecdotal records resemble some aspects of the better types of rating
scales in that they call for descriptions of behavior as observed in
concrete situations. The American Council on Education Personality
Report, for instance, calls for a specific illustration of every
characteristic rated on the graphic scale: if the student has shown
evidence of leadership, the rater is asked to describe a situation in which
this was demonstrated. An anecdotal record differs, in that it consists of
a collection of such incidents described soon after the event and
accumulated in the subject’s file. If the incidents are well chosen and
well described (neither of these desiderata can be taken for granted) it
is then possible to analyze these records and construct a dynamic and
characteristic picture of the individual in question and to make judgments
concerning his probable behavior in other situations. This technique has
been studied by Jarvie and Ellingson (398), and is described by Strang
(768: Ch. 5) and Traxler (860: Ch. 7).

Personnel records are another source of diagnostic data available to
schools, colleges, and business enterprises. The data included therein are
often so sketchy as to shed little or no light on the abilities, interests,
personality traits, background, or family situation of the person in
question; on the other hand, they frequently include a variety of important
diagnostic data. In a school or college the student’s courses and grades
are at least likely to be available, while in an industrial concern his type
and amount of education, previous employment history, marital status,
earnings, and attendance are likely to be on record. The case histories
of social agencies and credit ratings often provide other material. If the
records go into more detail concerning the subject’s special achievements
and problems the counselor or personnel worker has at his disposal data
on proficiency, interests, and personality traits which have the advantage
of having been accumulated over a period of time and therefore of
showing trends of development, and which generally reflect the
judgment of a variety of people. The principal problem in using personnel
records is to keep them sufficiently complete without making record
keeping take time that is needed for diagnosis and counseling. Strang
(768: Ch. 2) discusses the use of personnel records in schools and colleges;
treatment of their business and industrial uses will be found in Scott and
others (685: Ch. 8-10) and in Moore (538: Ch. ^).

Essays and autobiographies provide another source of diagnostic data.
Counselors and admissions officers in schools and colleges frequently ask
students to write an autobiographical sketch, often focussing on their
educational and vocational experiences and plans, in order to get an
understanding of their interests and motivation. There has been much
less systematic study of this technique than of most, despite its
widespread use. It is used not only in educational institutions, but also
by foundations granting fellowships; rarely by business enterprises. It is
briefly discussed by Strang (768:113-116) and at somewhat greater length
by Fryer (-’ 77 : 37 '--I» 9 >-

The Contribution of Tests to Vocational Diagnosis. What has just
been said should make it clear that psychological tests are only one way
of obtaining information needed to understand a person whom one is
counseling. To put it concretely, the intelligence of a young man two
years out of high school can be judged by an intelligence test
administered to him especially for that purpose, by his marks in high
school, by his father’s occupation, by his own occupational experience
since leaving school, and by various other indices.

It is true that all of these methods have defects: the test may not truly
represent his mental ability because of a reading handicap; his high
school marks may not be a good index because of his poor motivation
at that time; his father’s occupation may be the result of social
stratification rather than of his own enterprise and ability in a fluid
society; and his own occupational experience may have been distorted by
depression conditions. But they also have their own peculiar advantages:
the young man’s occupational history shows what he has actually done with
his ability in a situation in which the economic factors are known; there
is a demonstrated relationship between intelligence and occupational level,
whether the occupation referred to is that of the father or of the person
in question; high school marks correlate to a moderate extent with
intelligence tests and with subsequent achievement in college; and good
tests well given are relatively free from extraneous influences and do
yield a prediction of performance or satisfaction in some types of
activities which is as good as any other index available, sometimes much
better.
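
The statement that one index "correlates to a moderate extent" with another can be made concrete with a product-moment correlation coefficient. What follows is only a minimal sketch: the function name and the marks and test scores are invented for illustration and are not drawn from any study cited in this chapter.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the two means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: high school averages and intelligence test scores
# for six students (illustrative values only).
marks = [72, 85, 64, 90, 78, 69]
test_scores = [101, 118, 97, 112, 120, 95]

r = pearson_r(marks, test_scores)
print(f"correlation between marks and test scores: {r:.2f}")
```

A coefficient well below 1.00 is a reminder that neither index can stand in for the other, which is why the counselor is advised to consult both.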

The well-trained diagnostician therefore uses a variety of techniques
for gathering data about a person he is going to counsel or concerning
whose admission, employment, upgrading, or release he is to make a
recommendation. He uses psychological tests to obtain information
concerning aptitudes for analyzing new situations or for using fine
instruments; he checks this evidence against interview material and
personnel records which indicate what kinds of new situations the client
has met in the past and how he has met them, or what courses he has taken
and what hobbies he has engaged in which require manual dexterity and how
successful he was in these. Ratings and reports from former teachers or
employers provide evidence of proficiency in activities not covered by
marks and for which no proficiency test data are available. They also
supply data concerning the ability of the person concerned to get along
with superiors, associates, and subordinates, something not assessable by
means of the usual psychological tests. These illustrations could be
extended indefinitely, but should be sufficient to illustrate the point
that testing and non-testing techniques need to be used in combination for
the effective gathering of psychological and social data.

The above discussion presupposes the validity of the psychological tests
that are used, just as it presupposes the validity of the other methods
of gathering data and of the data which they yield. Educators and
business men who are not trained in statistics and in experimental methods,
and some who are trained in experimentation in other fields but not in
psychology, often fail to realize that in a blanket questioning of the
validity of tests they assume the validity of some other criterion or
predictor such as school marks, supervisors’ ratings, production records,
or their own judgment. They too often do not know how unreliable or
invalid these other indices have been shown to be by objective
investigations. Ample evidence on this subject will be presented later in
this book, in connection with the problem of selecting a criterion and in
discussing the validity of each test covered in detail. But it is pertinent
at this point to introduce some evidence of the value of tests in
vocational counseling.

The National Institute for Industrial Psychology has conducted a
number of studies in England and Scotland over a period of years, in order
to ascertain the value of vocational tests in counseling boys and girls in
their early teens who were leaving school and taking employment. The
results have been consistently favorable to counseling which utilizes test
data along with other information rather than depending only upon
traditional sources of data 0 1,389,.joi). Allen and Smith (11), for
example, followed up the children who had graduated from four
elementary schools. A control group had been counseled without benefit of
test data, whereas an experimental group had been tested with a variety of
vocational tests and counseled in the light of all types of data. The
vocational adjustment of the experimental group, as evidenced by job
stability, satisfaction, earnings, and similar criteria of success, was
significantly better than that of the control group.
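
A comparison of this kind, between an experimental and a control group on a criterion such as job stability, can be sketched as a test of the difference between two proportions. The counts below are invented for illustration and are not the figures reported by Allen and Smith; the function name is likewise hypothetical, and the pooled two-proportion z statistic is offered only as one common way of making such a comparison, not as the method those investigators used.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """z statistic for the difference between two independent proportions,
    using the pooled estimate of the common proportion."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / standard_error


# Hypothetical counts: number in each group still in a first job
# after a follow-up period (illustrative values only).
experimental_stable, experimental_n = 70, 100   # counseled with test data
control_stable, control_n = 50, 100             # counseled without test data

z = two_proportion_z(experimental_stable, experimental_n,
                     control_stable, control_n)
print(f"z = {z:.2f}")
```

With these invented counts the statistic is roughly 2.9, a difference larger than would ordinarily be attributed to chance. The same reasoning applies to any criterion that can be expressed as the proportion of each group meeting it.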

Pitfalls in Diagnostic Testing

Four major types of error are frequently made by users of tests. These
are 1) the neglect of other methods of diagnosis, 2) overemphasis on
diagnosis with the resulting tendency to neglect counseling, 3) failure to
take into account the specific validity of the tests used, and 4) the
neglect of other methods of guidance which should normally accompany
diagnosis and counseling. The first two pitfalls have already been dealt
with at some length in this chapter; the third is discussed in the next
chapter; in concluding this chapter some remarks on the fourth type of
error are in order.

Many of the earlier writers on vocational guidance, working at a time
when psychological tests were first being developed and when
interviewing was an unanalyzed art, were more impressed by the promise of
exploratory activities in school and on the job than they were by diagnosis
and counseling. Aware of the extremely limited usefulness of the tests of
their day and of the subjectivity and inadequacy of the interview as then
used, they had more faith in the ability of the individual to “find
himself” as a result of exposure to a variety of experiences in his school
work, leisure-time activities, summer jobs, and first few years of work
than as a result of a counselor’s work with him. This point of view is
expressed as late as 1932 in Brewer’s Education as Guidance (119), the
title of which indicates its philosophy.

Not a few more recent writers on vocational guidance have gone to the
other extreme, particularly those who have had a part in the development
of vocational tests during the past twenty years. Impressed by the gains
made in our ability to diagnose and predict, they have tended to
emphasize the role of the counselor or employment manager and to minimize
the importance of exploratory and induction activities. This emphasis
is shown in the writings of some psychologists of the 1930’s (190,928,931).

A third, most recent, group of writers have introduced still another
emphasis, that on therapy or counseling at the expense of diagnosis and
exploration, the first of which is considered positively harmful while the
latter is not considered at all, because of the emphasis on personality
adjustment (594,641).

Rogers’ and Williamson’s points of view have already been discussed in
another connection; the point which it is desired to bring out here is that
both of these newer emphases have minimized the role of exploration by
the individual and the use of exploratory activities by the counselor as a
means of furthering vocational adjustment. In the opinion of this writer,
diagnosis and counseling are essential to a program of vocational
guidance, and so is exploration. The effective vocational counselor is one
who knows when and how to use diagnostic techniques, when and how to
rely primarily on counseling, and when and how to help the counselee
engage in activities which will help him to obtain the insights and
information needed. In industrial and business personnel work also, there
are circumstances in which good selection is the crucial thing in securing
well-adjusted employees, others in which helping them to understand
themselves and their situations better is most important, and still others
in which good induction into the new company and try-out in a variety
of activities are the key to developing effective employees; the most
competent personnel man relies on a combination of such procedures. To
become so absorbed in the mechanics or dynamics of one aspect of
vocational guidance or personnel work as to lose sight of the others, or to
depend exclusively on one or two rather than using a combination of all
three, is to impose an unnecessary limitation upon the effectiveness of
one’s work.



CHAPTER II 


TESTING AND PREDICTION IN VOCATIONAL
SELECTION 

The Peculiarities of Selection Testing 

ALTHOUGH the tests used in vocational counseling are often identical
with those used in selection, the ways in which the tests are used have
generally differed considerably. In vocational counseling, the primary
objective is the development of an understanding of an individual by
himself and incidentally by the counselor, and the relating of personal
to occupational data. This is by definition a broad task which in our
present state of knowledge requires considerable dependence on
non-testing techniques and subjectively obtained information concerning
both counselee and occupations. Perhaps some day the dream of a
comprehensive battery of tests and of test weights for all the major
occupational fields, described by Clark Hull (385: Ch. 14), will be
realized, but current opinion is in agreement that both people and
occupations are too complex for this to be at all likely. In vocational
selection, on the other hand, it has proved possible to rely more heavily
on testing procedures. Familiarity with the reasons for this is essential
to the effective use of tests in both counseling and selection.

Fundamental among the factors which make possible greater reliance
on tests in vocational selection is the relative simplicity of validation,
that is, of checking test results against behavior which one is attempting
to predict. Whereas in counseling one is concerned with a great variety of
occupations, in selection the focus is on suitability for one or at most
several somewhat related jobs. The personnel man interested in improving
the selection of employees for certain jobs in his company works with
a relatively uniform criterion group (men in one job) and with a
relatively simple criterion. He is therefore able to make a careful
first-hand analysis of the activities involved in the job, to select or
develop tests which seem likely to prove valuable in predicting success in
its activities, to check up on the actual value of the tests and of other
indices such as the judgments of interviewers, and to utilize in his
selection program the combination of techniques which has actually worked
best for the job in question. If, for example, the objective is to select
effective operatives for a certain type of assembly work, an analysis can
be made of the processes involved in the assembly and of the skills which
seem to be required by them. Possible criteria of successful performance
can then be examined, some of them designed to serve as overall indices of
success, some perhaps selected to serve as measures of success in special
aspects of the work in which specific aptitudes play an important part. In
an assembly job the overall criterion may be the number of assemblies
correctly completed per working day or other unit of time; specific
criteria are not likely to be available in as simple a task as assembly
work, although some such work can be broken down into processes requiring
primarily gross and fine manual skills, spatial judgment, and perceptual
speed. The frequently forced dependence on one overall criterion of an
objective type has the advantage of reducing the amount of experimental
work, but has the disadvantage of making it seem deceptively simple.
Research has shown that production is affected by many factors, including
payment methods, location of work, type of supervision, and union
policies. Despite this fact, the use of vocational tests in selecting
employees for one type of job in one company, in which most of these other
factors are constant, is made relatively simple by the possibility of one
fairly adequate criterion of success.

A third factor which operates to make the use of tests in vocational
selection easier and more helpful than in counseling is the fact that the
personnel man has some control over the job situation. As he is working
for the company for which he is trying to improve employee selection the
company has a stake in his success, and as he knows the situation in
which he works, the people whose co-operation he must have, and the
policies governing their work, he is likely to be able to obtain the
co-operation which he needs and to be able to make changes in policies,
schedules, and other aspects of operations in order to achieve his
objectives. This improves both the chances of developing good tests and
the prospects that the personnel whom he has selected will work under
conditions which permit the success of qualified employees. It should be
noted, however, that since the user of tests in personnel selection is part
of an operating agency and must fit in with the operating needs of other
officials he is subject to pressures which may handicap him in his work.
Among these are the need for immediate results when preliminary work
should be done before applications are made, the lack of sufficient
numbers of employees in some jobs for adequate standardization and
validation to be possible, the difficulty of obtaining adequate criteria
(e.g., the impracticability in some situations of training supervisors to
rate objectively), and the fact that certain operations cannot be
interfered with in the way necessary to a particular project.

The fourth factor which generally operates to make possible greater
dependence on tests in personnel selection than in counseling is the
practicability and superiority of custom-built tests. Experience has
repeatedly shown that, when a battery of tests is developed especially for
use with one job or a group of jobs in the organization, specific local
factors can be taken into account which make the tests more valid than
tests which have been developed with more varied applicability in mind.
This is a crucial point which should be borne in mind by every user or
potential user of vocational tests for selection purposes; given the time
and the highly-trained technical personnel necessary to such work,
selection tests developed especially for use with certain jobs in a given
organization are likely to prove much more valid than more widely
applicable tests. A knowledge of the nature and validity of existing tests,
such as it is the purpose of this book to provide, is essential to good
testing of any kind, but the user of tests for selection purposes needs to
master also the techniques of test construction and validation and to
apply them to his work, or to obtain the services of a specialist who can,
under his general supervision, carry on such work. The next chapter
contains a discussion of the logic and methods of test construction and
validation, but does not attempt to present the statistical procedures. As
stated in the introduction, that should be the subject of another book.

An illustration of the superiority of custom-built tests will help make
the point that selection-testing is more practicable than guidance testing.
In selecting and classifying cadets for training as pilots, navigators, and
bombardiers in the Army Air Forces in World War II, some work was
done with tests of spatial visualization such as Thurstone’s Surface
Development (316:273) with results which led to the conclusion that
existing tests of this factor were not promising for aircrew selection
(a low biserial correlation with flying success). Instead, work was begun
along lines which were suggested by job analyses and which involved tasks
and materials resembling, at least superficially, the tasks in which
success was to be predicted. One of these tests which factor analysis has
shown to be a measure of spatial visualization in a way realistic for
aviation (316:479-486) was entitled the Instrument Comprehension Test. In
it the examinee read airplane flight instruments such as the artificial
horizon and decided which of the presented alternative pictures of an
airplane in flight represented the attitude (position relative to the
ground) of the plane indicated by the instruments. This test had validities
of .39 and .48 (two different parts of the test) for the experimental group
referred to above.

Most clearly a spatial visualization test for aviation, however, was the
Visualization of Maneuvers Test (316:277-284). The items in this test
consisted of a stem showing the attitude of an airplane and describing
the turns, climbs, and dives it next makes, followed by five
multiple-choice pictures of the same airplane in varying attitudes. The
task was to choose the alternative which indicated how the plane would be
flying after completing the maneuvers described. This would seem logically
to involve the ability to visualize the relationships of objects in space.
Anecdotal evidence is available in the observation of experienced pilots
taking the test and in their comments after taking it; they gesticulate
with their hands and sway in their seats as they act out the maneuvers
they are attempting to visualize, and say, afterwards, that they “just
about twist your hand off trying to do those maneuvers.” The correlation
of this test with success in flying training has been shown to be .23
(316:283). These results demonstrate considerable validity for single
tests, and more than that which characterized the more abstract type of
spatial visualization tests.

With the advantages deriving from a relatively uniform situation 
over which he has some control, with a criterion of success which is simple 
enough to permit validation but broad enough to be related to a number 
of different tests, and with the greater similarity between test and 
criterion which results from the ability to use custom-built tests, the 
personnel man working on selection problems can well depend more 
on tests than can the counselor who is trying to help people with voca¬ 
tional choices. 

The Importance of Other Techniques 

Although the psychological factors which can be measured in selection 
are the same as those which can be measured for counseling purposes, 
there is less reason for thinking that non-measurable factors need to be 
measured in selection than in counseling, and more direct evidence to
justify a greater dependence on the factors which can be measured. 

Numerous studies of the employment interview, summarized by Bingham
and Moore (96), have shown that as they normally work there is
so little agreement among the judgments of interviewers that employment
interviews have little value. Since the bulk of these studies were
made, improved techniques have been developed which make possible
a reasonable degree of agreement between interviewers; these involve
training interviewers, standardizing the interview situation, focussing on
certain traits or aspects of behavior most readily observable in the
interview, and providing standardized scales for the rating of traits or
behavior and the notation of substantiating facts. Bingham and Moore
(96: Ch. 2, 1st ed.) give an illustration of a form of this last type.
Despite such improvements experience continues to demonstrate that in many
situations interviewing techniques do not contribute much to prediction
for specific jobs. For example, an aviation psychologist met regularly
with a flight surgeon as a member of a board which reviewed the cases of
soldiers who made borderline scores on the aviation cadet classification
tests. This board interviewed these cadets, reviewed relevant material,
and decided whether they should be sent on to flying training or
disqualified on the basis of low aptitude. The board’s judgment was proved
to be of little value. The procedure was soon dropped, and cadets were
disqualified on the basis of test scores alone.

Another study was made somewhat earlier by the staff of the same
Army Air Forces Psychological Research Unit (316: Ch. 24), in which
a number of clinical techniques, as contrasted with objective tests, were
studied in order to determine their validity in predicting success in
flying training. These techniques included a standardized interview,
observation of behavior in an informal “rest period” between tests,
observation of behavior in two standardized situations in one of which the
cadet took an apparatus test by himself and in the other of which he
worked on a spatial assembly test as one of a group of three examinees,
ratings of behavior in standard psychomotor tests, and others. The
correlations between ratings based on these techniques and success in
primary flying school were practically zero, except for coefficients of
between .15 and .20 for the ratings based on observation in Heathers’
Control Confusion Test and on Super’s Interaction Test, the two
experimental situations designed especially to bring out ratable behavior.
The interview ratings had no validity, even though made by interviewers who
had at least the equivalent of a master’s degree in psychology with an
emphasis on clinical work. The objective tests used in the standard
selection and classification battery had validities which ranged from .29
to .51 in the experimental group of 1112 cadets (214:191). Dependence
on tests rather than on interviewers’ or observers’ judgments is clearly
justified by these two studies, although it is conceivable that a more valid
interview or observation procedure might be devised and personnel
trained to use it, as in the work of the Office of Strategic Services
(33,558). Finding time for it would then be the problem when large numbers
of candidates are involved. The AAF program tested cadets at a cost of
five dollars per man, whereas the OSS procedure required three and
one-half days, a hundred-acre farm, and fifteen professional staff members
for a group of eighteen candidates.

It should be pointed out that one reason why tests have proved to be
more valid than other techniques for gathering and evaluating personal
data for the prediction of vocational success is that the tests themselves
have been so constructed as to cover material which is often thought of
as obtainable only by other methods. It is not meant to imply that the
tests measured all relevant variables: a multiple correlation coefficient
of .66 (214:191) makes it quite clear that other factors also were
operating in the AAF studies, and the battery of tests avowedly was weak in
measures of personality and temperament. But the factual material which is
normally obtained by means of interviews and questionnaires and then
interpreted subjectively was obtained in a Biographical Data Blank
(316: Ch. 27) devised by Laurance F. Shaffer, weighted according to the
experimentally ascertained importance of each possible response to each
question, and scored to yield a measure of background factors and
experiences which play a part in flying success. It had a validity of .33
(214:191). The technique was not entirely new: it was used in the Civil
Aeronautics Administration testing program by E. Lowell Kelly (2(10)
and prior to that had become a standard method in the selection of
salesmen by a number of life insurance companies. In the latter, for
example, a positive weight was given to affirmative answers to questions
as to whether the examinee was married, had children, or carried
insurance, since these were found to characterize men who made good
salesmen.
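
The weighted scoring of a biographical blank described above can be sketched in a few lines. The sketch below is only an illustration of the principle: the item names and the weights are hypothetical, chosen to echo the life insurance example, and are not taken from the Biographical Data Blank or from any company's form. In practice each weight would be ascertained experimentally from the relation of the response to the criterion.

```python
# Hypothetical response weights: illustrative values only, not drawn from
# any actual biographical blank or insurance-company form.
WEIGHTS = {
    "married": {"yes": 2, "no": 0},
    "has_children": {"yes": 1, "no": 0},
    "carries_insurance": {"yes": 2, "no": -1},
}


def score_blank(responses, weights=WEIGHTS):
    """Sum the weight attached to each response.

    Items or answers that have no weight on file contribute nothing.
    """
    total = 0
    for item, answer in responses.items():
        total += weights.get(item, {}).get(answer, 0)
    return total


applicant = {"married": "yes", "has_children": "no", "carries_insurance": "yes"}
print(score_blank(applicant))  # 2 + 0 + 2 = 4
```

The point of the design is that the interpretation of each fact is fixed in advance by the weights, rather than being left to the judgment of the person reading the blank.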

Work done in recent years by German military psychologists (245), by
Murray and his colleagues at Harvard before American entry into
World War II, and by the same investigator and the staff of the Office
of Strategic Services during that War (33,558) has demonstrated that
there are possibilities in the development of the standardized situation
test (see p. 529 ff.) which should not be neglected in selection programs,
nor, for that matter, in counseling programs. The ultimate form of such
tests may perhaps not be comparable to the paper-and-pencil or apparatus
tests that we now consider objective; instead, it may combine some
of the standardized features of the objective test with some of the
subjective features of the interview. But, in improving their validity by
standardizing the situation and the method of evaluation, psychologists
take them out of the category of non-testing techniques and into that of
testing techniques. A book of this type written ten or twenty years from
now may well need to devote a great many pages to the discussion of
such standardized life-situation tests. At present they are experimental
and of unknown validity, and so are briefly considered only as a promising
technique for the evaluation of personality.


The Validity of Selection Tests

The problems and methods of validating tests for selection and
counseling purposes are taken up in the next chapter. It is pertinent here,
however, to examine the evidence concerning the value of tests in the
selection of employees, for what has been said on that score while
considering the limitation of other techniques has been piecemeal and
incomplete.

Working with applicants for employment with a utilities company,
Wadsworth (per,) gave two intelligence tests to an experimental group
and no tests to a control group, the former numbering 108 and the latter
594 men and women. After employment, by the usual methods in the case
of the non-tested applicants, data were gathered concerning their success
on the job. Employees were classified as outstanding, satisfactory, or
problem employees. The results, given in Table 1, show the superiority
of test-selected personnel in this one enterprise, as only 5.5 percent of
the latter were considered problem employees as contrasted with 29
percent of the non-test selected group.

TABLE 1

TEST-SELECTED EMPLOYEES IN A UTILITY COMPANY PROVED
SATISFACTORY MORE OFTEN THAN OTHERS

Type Employee      Test-Selected      Non-Test-Selected

Outstanding            33%                  22%
Satisfactory           61.5%                49%
Problem                 5.5%                29%

Total Number           108                  594

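
The percentages in Table 1 can be turned back into approximate numbers of employees, which makes the size of the difference easier to judge. The following is a minimal sketch using only the figures in the table; because the published percentages are rounded, the counts are approximations, and the variable and function names are merely illustrative.

```python
# Figures as printed in Table 1. Counts recovered from rounded
# percentages are approximate.
table = {
    "test_selected": {
        "n": 108, "outstanding": 33.0, "satisfactory": 61.5, "problem": 5.5,
    },
    "non_test_selected": {
        "n": 594, "outstanding": 22.0, "satisfactory": 49.0, "problem": 29.0,
    },
}


def approximate_counts(group):
    """Convert each percentage in a group to a whole number of employees."""
    n = group["n"]
    return {k: round(n * pct / 100) for k, pct in group.items() if k != "n"}


for name, group in table.items():
    print(name, approximate_counts(group))

# How many times more frequent problem employees were without tests.
ratio = table["non_test_selected"]["problem"] / table["test_selected"]["problem"]
print(round(ratio, 1))
```

On these figures roughly 6 of the 108 test-selected employees, as against roughly 172 of the 594 others, were classed as problem employees, a rate a little more than five times as high in the non-test-selected group.
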
Strong used a different type of test with a different type of employment,
obtaining his data in the somewhat less satisfactory manner of testing
employees already on the job (775: 487-498). Despite this his data are
impressive, and there is no reason to think that they would have been
different if testing had preceded employment. Relevant to this topic
is his finding that 56 percent of the life insurance salesmen who scored
A on his life insurance salesman’s scale sold $150,000 worth of insurance
per year (enough to yield a living in commissions at that time), whereas
only 6 percent of those who made scores of C sold that much insurance.

Finally, data from the army aviation testing program of World War II
might be cited, because of the unusually large numbers tested, the
extensive batteries of tests involved, and the nature of the criteria used.
Figure 1 shows the percentage of cadets at each ability level (determined
by tests) who were eliminated from primary flying training, the first nine
weeks of actual flying as a student pilot. The trend is obvious at once:
the short bar at the top shows that only four percent of the 21,474 cadets
who entered training between October 1942 and December 1944 with
pilot stanines of nine (standard scores expressed on a nine-point scale)
were eliminated from primary flying school because of flying deficiency,
fear, or their own request, whereas the long bar at the bottom of the
graph shows that 77 percent of the 904 cadets who entered training during
that same period with pilot stanines of one were eliminated. These
low-scoring cadets were less numerous than the high-scoring, because of
the raising of requirements as the use of tests became more completely
accepted and as the progress of the war made smaller quotas of new
pilots possible. By the end of the war it was possible to accept for pilot
training only cadets with pilot stanines of seven. This meant that,
instead of an elimination rate of 24 percent as in this group of 185,367 in
the middle two years of the war, only 10 percent would be eliminated if
other factors remained constant.
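
Since the stanine is described above only as a standard score expressed on a nine-point scale, a brief sketch may help the reader who has not met it before. The code assumes the usual textbook convention (a mean of 5, a standard deviation of 2, bands half a standard deviation wide, and clipping at 1 and 9); the exact procedure by which the AAF combined and converted test scores is not given in this passage, so this should be read as an illustration of the scale, not as the AAF method.

```python
import math


def stanine(z):
    """Convert a standard (z) score to a stanine.

    Assumed convention: mean 5, standard deviation 2, each band half a
    standard deviation wide, clipped to the range 1 through 9.
    """
    return max(1, min(9, math.floor(2 * z + 5.5)))


for z in (-2.0, -0.6, 0.0, 0.6, 2.0):
    print(z, stanine(z))
```

Under this convention a cadet exactly at the mean receives a stanine of 5, and one two standard deviations above the mean receives a stanine of 9.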

Even more conclusive evidence is available from the experimental
group described in Report No. 2 of the Aviation Psychology series (214)
and by Flanagan (264). As has been previously stated, this group was
selected without reference to test scores, the only official requirement
being the passing of the physical examination. Actually, the group was
also somewhat selected according to traditional methods, as they were
accepted at a time when the normally enforced standards were well
known and the men presumably applied with the thought that they
could meet them. This is shown by the fact that only 23 percent were not
at least high school graduates, as contrasted with 37 percent of
men-in-general at that age (61). Whereas the selection and classification
tests normally admitted to training only one failure to every three or four
successes, the non-test selected experimental group included one failure
for every success. If, as there is reason to believe, other things such as
the strictness of instructors, check riders, and elimination boards
remained relatively constant, the use of tests was clearly an improvement
over selecting merely on the basis of physical examination and, to a lesser
extent, education.

Pilot Stanine      Number of Men      Percent Eliminated

     9                 21,474                4%
     8                 19,440               10%
     7                 32,129           [illegible]
     6                 39,398               22%
     5                 34,975               30%
     4                 23,699               40%
     3                 11,209               53%
     2                  2,139               67%
     1                    904               77%

   Total              185,367               24%

FIGURE 1

TEST SCORES AND SUCCESS IN AAF PRIMARY PILOT TRAINING
The bars indicate the percentage eliminated, at each pilot stanine
(combined test score), for inability to fly, fear, and at own request.
Credit for flying experience is included in the stanine. Data are for
classes trained during 1943 (when some low stanine men were admitted),
1944, and 1945. After Flanagan (264:76).

Programs of Testing for Selection, Placement, and Upgrading 

Despite the evidence which shows that subjective methods of evaluating
applicants for employment add little or nothing to the predictive
value of well-constructed and validated objective tests, personnel men
and vocational psychologists continue to utilize interviews, application
blanks, rating scales and letters of recommendation in selecting
employees. This is partly because of an unreasoning distrust of purely
objective methods, partly because of the knowledge that even the best of
test batteries do not cover everything and the hope that other methods
will supplement them, and also because, in practice, tests are often
used without the thoroughgoing standardization and validation
procedure which is necessary before one can know just how valid they are
and whether or not selection is in fact improved by supplementing tests
with other techniques.

When job analyses have been made the emphasis in testing is likely to 
be on placement on the right type of job; when differential ability data 
are lacking, it is likely to be on selection of generally promising em¬ 
ployees. 

One large corporation, to cite a concrete instance, uses psychological
tests in three of its divisions. In one division of this corporation a new
plant had been built and the personnel director was told that the
management wanted to make it a model plant. He was accordingly
directed to devise a battery of tests which would be appropriate to the
jobs to be filled, and to select employees on the basis of tests and other
data from the beginning. As is frequently the case in actual operations,
the pressure of the situation, that is, the need for selecting employees on
some basis and the belief that even tests which had not been validated
in that plant would help in the selection of better employees than would
be selected without test results, caused the use of tests without the benefit
of the scientific preliminaries which are usually considered desirable. The
personnel director therefore put into use a battery of tests which, judging
by results in other plants in which somewhat similar work was done,
seemed likely to prove valuable. They were used in an attempt to exclude
from any type of job the most awkward, most maladjusted, and least
intelligent, that is, for selection. At the same time provisions were made
for the gathering of data concerning the success of the new employees on
their jobs. Although the use of the tests in making decisions concerning
selection could be expected to reduce the range of abilities in any one job,
it was felt that the shortage of labor would result in a spread of abilities
sufficient to reveal whether or not a relationship existed between test
scores and job success. In such a situation it was only natural that
previous experience, schooling, and similar background factors were weighted
quite heavily by the employment manager in going over the results of
tests, interviews, application blanks, and letters of recommendation.

In another division the psychologist in charge of testing began by
making a systematic analysis of the jobs in question, using standard job
psychographic techniques. He then selected and devised tests which he
thought would be effective measures of the characteristics which
appeared to differentiate the major types of jobs. The experimental battery
of tests was administered to all applicants for factory employment and,
as data accumulated, the test results were correlated with supervisors’
ratings in order to determine their actual value in selection. One test was
found to add nothing to the predictive value of the battery, but, as it
took little testing time and appealed to applicants and foremen, it was
retained; other tests which had some value were weighted accordingly and
used in selection and in placement in appropriate jobs. The validities
of the battery average about .50 and, at the time of writing, are based on
rather small groups. No personal history or biographical data form of the
type discussed earlier in this chapter is used. The testing program is still
relatively new in this plant. For these reasons the employment interview
is depended upon rather heavily, and decisions are made after the
background and manner of the applicant have been mentally (rather than
statistically) weighted in combination with the test results by the
employment manager, with the emphasis on placement in a suitable job.
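
The step of correlating test results with supervisors’ ratings, on which the second division’s program rested, amounts to computing a validity coefficient for each test. The sketch below computes the ordinary product-moment correlation; the scores and ratings are invented for illustration and the function name is hypothetical. It is not known from the text which coefficient the psychologist in question actually used.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples of at least two cases")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # If either variable is constant the denominator is zero and
    # ZeroDivisionError is raised, since the correlation is undefined.
    return sxy / math.sqrt(sxx * syy)


# Invented data: test scores and supervisors' ratings for eight employees.
test_scores = [12, 15, 9, 20, 17, 11, 14, 18]
ratings = [3, 4, 2, 5, 3, 3, 2, 4]
print(round(pearson_r(test_scores, ratings), 2))
```

With groups as small as those mentioned above, a coefficient of this kind is unstable, which is one reason for the caution that the validities reported are based on rather small groups.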

The third division of this corporation operated in a part of the
country in which the labor shortages resulting from wartime and postwar
developments were serious. In practice, employee selection became more
a matter of employee placement. The personnel manager therefore
selected a battery of tests without regard to special aptitudes and abilities
such as might be important in selecting for or in placing people in
different types of jobs; believing that even selective placement was
generally out of the question in that plant, the emphasis was placed on
tests of certain basic general factors the understanding of which would
help foremen and supervisors to induct and handle the new employee more
effectively. The employment battery therefore consisted of a test of
general intelligence, a measure of personality adjustment, and a measure of
vocational interests. The nature of the tests was explained to supervisors,
the scores of each new employee were discussed with them, and they were
helped to understand the types of adjustment problems which the new
employee might encounter. It was believed that the supervisors’ interest
in and intelligent use of this information was an important factor in the
development of satisfactory employees, although no objective evidence
was gathered on this subject.

Psychological tests are frequently put to use in business and industrial 
personnel work for the upgrading of personnel, that is, the evaluation of 
employees for possible promotion to more responsible positions. In this 
type of work two approaches are possible, one of them comparable to 
selection testing, the other to placement testing. In the former, tests and 
other techniques are used which will throw light on the general promise
of the persons in question: their general intelligence, personality
adjustment, leadership, and similar general characteristics are assessed by
means of tests, inventories, ratings by superiors, and interviews. In the
latter, data are gathered by similar methods, but they are data about
special abilities, interests, and personality traits that are known or thought 
to be important to success in specific jobs at higher levels. 

For example, a number of aviation psychologists worked under the 
leadership of John C. Flanagan in the American Institute for Research,
on the evaluation of airline first officers for possible promotion to
captaincies. In this program an analysis was made of the abilities and
characteristics needed by the captain of a commercial airliner. Tests were
selected which previous work with pilots had demonstrated to be cor¬ 
related with success in flying twin and four engine planes; others were 
constructed to measure characteristics not covered by existing tests; and 
interview procedures were developed for tapping other factors which 
could most effectively be assessed in face-to-face contacts. Techniques for 
quantifying the results of interviews were developed, and the results 
obtained by any one interviewer were so treated as to make them com¬ 
parable to the results obtained by others, thereby minimizing the sub¬ 
jective elements. At the same time, the flight records and ratings of first 
officers by captains and check pilots were utilized as objective measures 
of proficiency and achievement, after they had been subjected to a 
statistical study which demonstrated their reliability and validity. The
resulting data were weighted to provide an overall score indicative of the 
pilot’s promise as a captain; this, and a three hundred word sketch 
verbally summarizing the first officer’s assets and liabilities and pointing 
out how they might be respectively utilized and corrected in this and 
other possible jobs, were turned over to company personnel officers for 
use in making decisions. 

In such a program tests play an important part in assessing
characteristics which are not called for in the job currently held, or the
exercise of which cannot be well observed on the job. They help to isolate
factors which, even though observable in the employee at work, are so
intertwined with other factors that the observer has difficulty in
determining the relative importance of a given strength or weakness. And,
finally, they are free from the taint of possible bias



METHODS OF TEST CONSTRUCTION,
STANDARDIZATION, AND VALIDATION


TO BE fully competent in the use of vocational tests it is necessary to
know all stages and types of work with tests. This does not mean that the
vocational counselor or personnel director must be an expert in test
construction, nor that the developer of tests must also be expert in using
them in counseling or selection. But it does mean that the vocational
counselor must be familiar with the procedures and problems of test
construction, and that the technician whose function it is to develop
tests must understand their use in counseling and selection, if the tools
essential to diagnosis are to be worth using and well used. It is therefore
the purpose of this chapter, not to provide a manual of test construction,
but rather an orientation to test construction which will enable the user
of tests in counseling and personnel evaluation to read the published
test research with a critical appreciation of the problems involved and
thus to understand more completely the meaning of the results obtained
when using tests.

The development of a vocational test can be broken down into seven
major steps. These are: job analysis, selection of traits to test, selection
of criteria of success, item construction, standardization, validation, and
cross-validation. In any given test construction project one or more of
these steps may conceivably be slighted or omitted altogether: when this
is the case, however, it should be because sufficient work has already
been done along those lines to provide a basis for the next step, or
because the pressure of time and circumstances makes the taking of short
cuts necessary and dependence on hunches seem wise. The critical reader
must judge for himself whether or not the omission of the steps was
justifiable and whether or not the data are usable. The seven steps will
now be taken up in some detail.

Job Analysis

Before tests can be selected or constructed for the measurement of
aptitude or personality traits which affect success or satisfaction, it is
necessary to have an understanding of the characteristics and abilities
which play a part in the work in question. The process of collecting and
analyzing information which provides this understanding is called job
analysis. Whether it is done scientifically or otherwise, some type of job
analysis has to be performed before an aptitude test can be constructed.
It may be an armchair analysis, in which the test constructor draws on
his familiarity with the job or occupation for which tests are being
constructed in order to set up hypotheses as to the characteristics which
make for success in that work. It may involve bibliographical research,
to ascertain what others have thought or found to be important in that
occupation. It may be an analysis of manuals used in the training of
people for the work in question, in order to judge the abilities needed in
mastering the fundamental skills. It may involve discussing it with
supervisors, observing and interviewing workers doing the work, trying the
operations oneself, or even learning the job and working at it for a
period.

In analyzing the work of military pilots a combination of these methods
was used as time and circumstances permitted. First, J. C. Flanagan
analyzed the proceedings of boards which eliminated failing aviation
cadets from primary flying training, in order to ascertain the reasons
given for their failure by the boards. This resulted in a list of
characteristics ranging from lack of co-ordination to poor motivation, and
a table showing the incidence of each of these reasons in a large sample of
eliminees. Then J. K. Hemphill, drawing on his own experience as a
civilian flyer, and the writer, depending on observations of military
pilots at work and demonstrations of flying in which he performed some
of the operations, made an analysis of training manuals in order to
describe the pilot's tasks as a basis for setting up hypotheses concerning
characteristics which would make for success in learning to fly. After this,
N. E. Miller, J. L. Wallen, and the writer went to a military flying school
in which Miller and Super worked as participant observers, living in
barracks with the cadets, attending ground school and physical training,
handling planes on the flight line, learning to fly, and being graded for
their flying on the same basis as cadets. Wallen worked in the station
hospital, administering clinical tests to the cadets being studied,
interviewing them concerning their background and development, and
collecting other types of information from hospital, training, disciplinary,
and other records. All three job analysts kept notes concerning the
observed behavior of the twenty cadets of whom intensive case studies were
being made, whether on the flight line, in the barracks, or on “open post”
in the nearby town. The two investigators who were flying kept detailed
records of their own experiences in learning how to fly. These materials
provided a basis for detailed study of the task of learning to fly, of
emotional aspects of the experience of learning to fly, and of factors which
made learning to fly easier or more difficult for a random sample of
cadets. P. L. Fitts interviewed the returned members of a bombardment
squadron in order to get their account of the nature and requirements of
combat flying, analyzed the material, and made it available to aviation
psychologists working on test construction. Flanagan spent some time in
a combat theater studying records, interviewing flyers, and flying a
number of missions in order to analyze the task of combat flying at first
hand. Later, research detachments conducted similar investigations on a
larger scale in most theaters of the war (167).

The above description of job analysis activities in one practical
situation is given in order to illustrate the variety of approaches that may
be used in the study of the nature and requirements of a job or an
occupation. In practice there is not necessarily one method of job analysis;
it is more likely that there are several which will yield valuable
information, and that more than one must be used if adequate data are to be
made available as a basis for selecting or devising tests. The brief survey
of the development of job analysis methods which follows will bear this out.

The scientific analysis of jobs was begun early in this century by
Frederick W. Taylor (811) as a means of increasing the productivity and
facilitating the work of industrial employees. It was soon seized upon
by psychologists as a method of ascertaining in a preliminary way the
abilities and traits needed in an occupation and thus of providing a
basis for test construction. Taylor’s methods, and those of Gilbreth (289)
and other workers whose interest was primarily in engineering,
emphasized time and motion study; the picture of a job derived from such
work therefore proved to be too narrow in its viewpoint for personnel work,
leaving out of consideration such things as the education and training
required of the worker, the interests which might find outlet in the
activity, and the environment in which the work is done. They also provided
too detailed a picture of the manual operations involved in the work,
although Cohen and Strauss (162) have used the technique effectively in
studying manual dexterity. Other methods were therefore resorted to,
in an attempt to obtain information which would provide a suitable
basis for test construction.

One of these was the job psychographic method developed by Viteles
(581,899,901). It begins with a description of the occupation, describing
the duties performed, the nature and conditions of the work, the training
involved, the related jobs from which workers may be recruited and
to which they may be promoted, the advantages and disadvantages of
the work, and the personal, physical, educational, temperamental, and
experience requirements of the job. This material is gathered by
observing the performance of the work and by interviewing workmen and
supervisors. So far, this is the standard job description or position
description technique. In order to objectify the analysis of the job Viteles
developed a standard list of 32 abilities which are rated on a five-point
scale by the analyst; the list consists of such factors as energy,
co-ordination, visual discrimination, and logical analysis. The ratings,
placed on a graphic scale, yield a profile of the abilities required by a
job and give their name to the method.
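
The profile produced by such ratings can be pictured with a short sketch.
What follows is only an illustration, not the published rating form: the four
abilities are those named above, but the ratings, the meaning attached to the
ends of the scale, and the function name are assumptions.

```python
# Hypothetical ratings on a five-point scale. The four abilities are those
# named in the text; the values and the meaning attached to 1 and 5
# (1 = negligible, 5 = essential) are assumptions made for illustration.
RATINGS = {
    "energy": 3,
    "co-ordination": 5,
    "visual discrimination": 4,
    "logical analysis": 2,
}


def psychograph(ratings, scale_max=5):
    """Return a text profile: one line per ability, '*' marking its rating."""
    width = max(len(ability) for ability in ratings)
    lines = []
    for ability, rating in ratings.items():
        if not 1 <= rating <= scale_max:
            raise ValueError(f"{ability}: rating {rating} is outside 1..{scale_max}")
        scale = "".join(
            "*" if point == rating else "." for point in range(1, scale_max + 1)
        )
        lines.append(f"{ability.ljust(width)}  {scale}  ({rating})")
    return "\n".join(lines)


print(psychograph(RATINGS))
```

Reading down the column of markers gives the kind of graphic profile the
method is named for; a full psychograph would carry all 32 abilities.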

The most recent form of job analysis, adapted especially to vocational
guidance because it deals with broadly rather than with narrowly defined
jobs, is that widely applied by the Occupational Analysis Division of
the United States Employment Service under Carroll L. Shartle (714:
Ch. 11). Items which have a bearing on test construction include a
description of the work performed, the amount and type of supervision
received, the responsibility, knowledge, initiative, alertness, judgment,
dexterity, and accuracy involved, the tools used, production standards,
working conditions, physical demands, and other characteristics required
for performance of the work. The use of this procedure, like Viteles’,
yields a list of abilities and traits which are considered important in the
occupation or job being studied.

Selection of Traits to Be Tested 

The analysis of the job provides the test constructor with a list of
aptitudes and traits which are deemed important in that job. But this
list is subject to two serious limitations. These are the subjectivity of the
evidence and the uncertainty that a particular factor, even if it proves to
be important, will differentiate this job from others. The fact that ability
to get along with others is thought important in a given job is, for
example, ascertained only by the analyst’s observations or by opinions
transmitted to him by persons who know the work. The data are no more
reliable than the judgment of the people gathering or supplying them.
Furthermore, if the presence of the trait is subjectively ascertained in the
first place, there may be no objective method of assessing it, for it may be
a characteristic which has so far eluded the attempts at measurement.

Granting that ability to get along with others is a prerequisite of the
job being studied, there is still a question as to whether or not it
differentiates this job from others. There are many jobs which require
ability to get along with others; even if this trait could be measured, its
measurement might contribute little that is of value to differential
diagnosis and prediction.

Once the job analysis is complete and the list of presumably important
characteristics is available, the first task of the test constructor is to
make some decision as to, 1) the relative importance of each trait or
aptitude, 2) the availability of a suitable criterion against which to
validate a test of this trait, 3) the chances that a given trait is important
in this job and unimportant in others with which he is also concerned, 4) the
unavailability of some reliable and economical non-testing technique for
judging this characteristic, and, 5) the prospects of his being able to
locate or devise a test which provides an objective measure of the
characteristic in question. The job analysis should provide evidence of a
subjective type concerning the first point, as, for example, in Viteles’
psychographs. The next section deals with the important problems which
arise in connection with the choice of criteria. A comparison of the job
analysis data for the job in question with available evidence from other
jobs should provide a basis for judgment of the third point. In connection
with the fourth point, the use of school grades and supervisors’
ratings should be considered. For the fifth, the psychologist must be well
acquainted with the various types of tests which are already in existence
and with the extensive literature on test construction in which abortive
as well as successful efforts at test construction have been described. In
the light of these considerations, the psychologist is able to draw up a list
of aptitudes, skills, and personality traits ranked in the order of the
likelihood with which they may be successfully studied.

Selection of the Criteria of Success

Jenkins (400) has pointed out that the events of World War I taught
American psychologists the necessity of validation, the next two decades
taught them much about the technique of validation, and that World
War II drove home the necessity of devoting much time and thought to
the basis of validation. In most of the test validity research of the 1920’s
and 1930’s much space is given to descriptions of the technique of test
construction, the methods of securing data, the description of the
criterion used, and the results of the relating of test scores to criterion
data. Not infrequently one of these topics is somewhat neglected: that in
which the criterion is described. But, even when the criterion is
adequately described, too little attention is paid to its adequacy as an
index of success.

This lack of emphasis on the criterion can be illustrated by a study
(ftfu) in which the group of aircraft factory inspectors on which the
battery of tests was validated were not defined as to type of material
inspected, sex, or age, and were described as “probably representative” with
no supporting statistical analysis; the raters who made the criterion
judgments knew the subjects as students in a refresher course, but knew
their job performance only in “most” cases; the ratings of two instructors
had an intercorrelation of .yy, and their correlation with subsequent
ratings by supervisors was .42. Some of the data just presented are quite
adequate, the intercorrelations of ratings being quite high for such
material; and yet it should be obvious that, with no more attention devoted
to the criterion than in this study, it is difficult to interpret the results.
For example: specifically what type of performance was rated, that it
correlated highly with intelligence (.fig) and only moderately (.3with
mechanical comprehension? Data for engine and fuselage inspection
might differ. Was the immediate criterion (instructors’ ratings) only
moderately related to the ultimate criterion (supervisors’ ratings) because
of low reliability of the latter, lack of common factors in the instructional
and work situations, or some other uninvestigated factor? Admittedly the
judgments of the instructors are one type of evidence that is available
early in the new employees’ job experience, but how valid a criterion
is it, that is, how good a measure is it of what the tests are trying to
predict? If a test has a correlation of .b \ with the immediate criterion,
and the immediate criterion has a relationship of only .42 with the
ultimate criterion, the relationship between the first predictor and the
ultimate criterion is not very high. A more thorough study of the nature
and meaning of the criterion serves to clarify issues and suggest better
predictive devices. At the same time, it is true that whether or not it is
desirable to devote time and personnel to such a study depends on other
factors in the situation, e.g., the savings that would be effected by
improved procedures.
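
The arithmetic behind the remark about a test, an immediate criterion, and
an ultimate criterion can be sketched briefly. The sketch rests on an
assumption not stated in the text: that the test is related to the ultimate
criterion only through what it shares with the immediate criterion (a zero
partial correlation), in which case the implied correlation is the product of
the two known coefficients. The figures .60 and .42 and the function name are
illustrative only.

```python
def implied_ultimate_validity(r_test_immediate, r_immediate_ultimate):
    """Correlation of test with ultimate criterion, ASSUMING the test relates
    to the ultimate criterion only by way of the immediate criterion."""
    for r in (r_test_immediate, r_immediate_ultimate):
        if not -1.0 <= r <= 1.0:
            raise ValueError("a correlation must lie between -1 and 1")
    return r_test_immediate * r_immediate_ultimate


# Illustrative values: a test correlating .60 with instructors' ratings,
# which in turn correlate .42 with supervisors' ratings.
r = implied_ultimate_validity(0.60, 0.42)
print(f"implied validity against the ultimate criterion: {r:.2f}")
```

When the assumption does not hold the true figure may be higher or lower,
which is one more reason for studying the criterion directly.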

The typical but unwise procedure in test construction is, too often, to
leave the detailed consideration of a criterion until somewhat later in
the process than has been done in this discussion. Usually, having
decided what factors he should try to test, the psychologist has proceeded
to develop suitable tests, administer them to appropriate subjects, and
then for the first time seriously consider the problem of criteria. The
vague ideas that he has so far had are now crystallized, the most readily
available index of success is used with little or no investigation beyond
a cursory check on its reliability, and the relationship is computed.

The experience of Naval aviation psychologists summarized by Jenkins
in the paper referred to above, and the experience of Army aviation
psychologists summarized by R. L. Thorndike (833), suggest that the
order of the steps taken in test construction needs to be changed, and
that considerable emphasis needs to be put on the problem of selecting
and evaluating a criterion early in the process. Once the traits to be
measured have been determined, attention should be turned to the
selection of a criterion and to the refinement of methods of collecting
evidence against which the tests to be developed can be validated. The
discussion which follows describes the major types of information which
are used as indices of vocational success, indicates some of their strengths
and weaknesses, and illustrates them from research. In doing so, it relies
to a considerable extent upon the work of aviation psychologists in
World War II, partly because the nature of the aviation psychology
programs, both as to problems faced and staff available to study them,
makes them an especially good source of such material. Illustrations are
also taken from studies in the field of industry and education.

Thorndike (833: Ch. 4), Huiiini (38b) and others have distinguished
between immediate, intermediate, and ultimate criteria. In military
aviation these are respectively illustrated by such evidence as ability to
complete training as a bombardier, accuracy of bombing (indicated by
average circular error) on the practice range in operational training, and
accuracy of bombing in combat. Immediate criteria are generally partial,
that is, they tend to emphasize limited aspects of performance. If grades
in medical school, for example, are used as an index of success, some men
with good academic ability but poor social adjustment will be rated as
more successful than certain other students with somewhat less academic
ability but superior social adjustment, whereas if an ultimate criterion
of success in the practice of medicine can be utilized the latter may prove
to be more successful than the former. Conversely, ultimate criteria are
more complex than immediate or intermediate indices of success; for
this reason, as well as because of the pressure of time, they are rarely
used in test validation. In the case of military pilots, for example, it was
necessary to put a classification program into operation on a large scale
shortly after the bombing of Pearl Harbor. This meant that there was
no time in which to gather data on the subsequent combat success of
cadets before establishing weights for the experimental tests. Collecting
such data actually took more than two years. Instead it was necessary
to use an immediate criterion, in this case evidence of the cadet’s ability
to graduate from primary flying school, which became available in about
five months. This is by no means a simple criterion, as it is affected by
a variety of factors such as the cadet’s various abilities and personality
traits, the attitudes of the instructors under whom he works, and the
extent to which the school he attends adheres to or deviates from
established practices and standards. But it is not an ultimate criterion, as
ability to complete the first stage of flying training is not necessarily
identical with ability to outfly enemy pilots or to withstand the greater
and more enduring stresses of battle. Since pilots who cannot complete
training never get to combat the criterion is, however, suitable in a
negative way. The same argument applies to the selection or guidance of
physicians, teachers, and any other group which must surmount a
training hurdle before they can compete in practice.

The first characteristic to be sought in selecting a criterion is relevance.
If the immediate criterion is to be a valid one, it must adequately
represent important aspects of the ultimate criterion. If success in
completing training is to be a suitable immediate criterion, the activities
and requirements of the training program must resemble those of the job.
Fortunately, the job analysis should provide a fairly good basis for a
subjective judgment of this matter. Jenkins (400) cites the case of aerial
gunnery, in which intelligence test scores were found to correlate highly
with grades in training, and might therefore have been assumed to
predict success in actual combat; but when the curriculum was revised to
make it less abstract and more practical the correlation between
intelligence and grades fell to zero.

A second characteristic of a good criterion is reliability (see page by 1
for definition). Thorndike (833) has pointed out that although high
reliability is not essential in a criterion, provided it is stable enough to
reveal the existence of a relationship, the more reliable the criterion is
the more clearly the degree of the relationship is demonstrated. Low
reliability is caused by intrinsic factors such as the inconsistency of the
performance which is being studied, and by extrinsic factors such as
variability in the conditions of work, the lack of agreement between
raters either in the use of terms or in the interpretation of behavior, and
bias in the situation. An illustration of inconsistent performance is
provided by an analysis of errors in determining the position of an airplane
at key points in the mission (833), which showed that the number of
such errors made in one mission has no relationship to the number of
errors made in the next mission. As the reliability of performance on a
single mission was considerably higher, it is probable that both the
inconsistency of the performance of such a complex task and variations
in external conditions played a part in the unreliability of performance
from one mission to the next. Variability in the conditions of work, in
these same aviation studies, consisted of such factors as temperature,
visibility of targets, and turbulence of the air and consequent instability
of the navigator’s and bombardier’s working platform. In business and
industrial studies such variations are illustrated by differences between
selling on an open floor on which the customer can approach the
merchandise and the clerk can use his skill in approaching the customer,
and selling behind a counter where the clerk can merely await the customer
in a more passive way, or by differences in supervision which affect the
attitudes and output of the workers. Meltzer (524) has for example
reported a study in which the Minnesota Rate of Manipulation Test
(Placing) had a correlation of —.27 with output under one management,
and of more than .20 in the same department under a different type of
management and with the different attitudes which it engendered. The
lack of agreement between raters is so well-known a factor that it hardly
needs elaboration: Jenkins (400) mentions a study in which Naval
aviation cadets were given successive check flights by two experienced
instructors, with a correlation coefficient of approximately zero for the two
sets of grades. Bias in the situation is well illustrated by differing
standards in the judgment of performance in different training institutions
from which graduation is the index of success, for example in traditional
academic colleges on the one hand and in progressive colleges which
emphasize more than intellectual accomplishment on the other.
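
The effect of criterion unreliability on an observed validity coefficient can
be sketched with the attenuation formula of classical test theory, which is
brought in here as an aid and is not part of the studies cited. Under that
model the observed correlation equals the correlation between true scores
multiplied by the square root of the product of the two reliabilities. The
true correlation of .50, the test reliability of .90, and the function names
are assumptions for illustration.

```python
import math


def attenuated(true_r, test_reliability, criterion_reliability):
    """Observed correlation expected under the classical attenuation model."""
    return true_r * math.sqrt(test_reliability * criterion_reliability)


def corrected_for_criterion(observed_r, criterion_reliability):
    """Estimate the correlation with a perfectly reliable criterion."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion reliability must be in (0, 1]")
    return observed_r / math.sqrt(criterion_reliability)


# Hypothetical case: a true relationship of .50 and a test reliability of
# .90, observed against criteria of decreasing reliability.
for criterion_reliability in (0.90, 0.60, 0.30):
    observed = attenuated(0.50, 0.90, criterion_reliability)
    print(f"criterion reliability {criterion_reliability:.2f} "
          f"-> observed validity {observed:.2f}")
```

The relationship remains detectable with a poor criterion, but its apparent
size shrinks as the criterion becomes less reliable.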

Criteria may be classified as proficiency measures, output records,
ratings, self-ratings, administrative acts, and internal consistency
measures. As Thorndike points out in his volume of the aviation psychology
series (833), some of these are enduring records which can be scored with
perfect agreement by different workers at different times (the first two
categories), such as answers to a multiple-choice test or hits on a target;
some have no enduring record but can be recorded objectively by an
observer (administrative acts, ratings and anecdotes), such as number
of bounces in landing a plane or number of customers approached; and
some are subjective evaluations for which no objective evidence of any
type is available save the overall impression in the observer’s mind
(ratings). Some discussion of each of these categories, with illustrations
of their use, should provide a better understanding of the validity of
tests.

Proficiency as measured by tests of information and skill in the
performance of a task is sometimes used as an index of success. In some
occupations, the work of which closely resembles the work of the
proficiency test, this type of criterion may be quite appropriate. The work
of a navigator in flight resembles that of the student of navigation in the
classroom in many important respects, even though it may differ insofar
as working conditions are concerned. The computations and instruments,
and even the sequence in which they are used, can be made the
same in the classroom or group test as in the airplane. This logical
analysis is borne out by a correlation of .49 between final examinations in
ground school and final average grade for missions (265), although
the coefficient is low enough to make it clear that there are factors
operating in flight which do not operate in the classroom, probably factors
of an emotional and perceptual nature. In many other occupations the
proficiency test situation is too unlike that in which the actual work is
performed for it to seem a satisfactory criterion: knowledge of the
operation of a .50 caliber machine gun, for example, would not appear to
involve the same aptitudes and skills as ability to hit a moving target
with it while standing on an unstable moving platform. Before an
achievement test can be considered a good criterion of success, an
analysis of the job and of the factors covered by the test is necessary.

Output can be gauged in a number of ways, varying with the nature
of the task. In a production job it may be the number of units produced
per hour, whether the units are identical parts turned on a lathe or
pounds of butter wrapped, or it may be the average earnings over a
given period when wages are based at least in part on volume produced.
In a sales job it may be the number of units sold or the dollar value of
the total sales, or a ratio of sales income to sales expense. In military
aviation it may be the number of hits on a target in gunnery, the average
circular error in bombing, or the number of planes shot down by fighter
pilots or gunners. Criteria such as these seem delightfully concrete and
objective at first glance, but one of the bitter lessons learned by applied
psychologists engaged in test construction work is that the appearance of
objectivity is frequently deceptive.

Investigations of incentive systems have shown (514,637), for example,
that the output of industrial workers is often governed by factors other
than individual differences in abilities or motivation and that artificial
limits are often set upon the amount produced per worker per hour.
A detailed study by Rothe (653,654) showed that individual daily work
curves of butter-wrappers vary greatly, but that nevertheless group trend
lines were a stable and usable criterion. He found no evidence of
restriction of output in his subjects. In sales work differences in
territories, in type of clientele, and in the aspirations and circumstances of
the salesmen often attenuate the relationship between volume of sales and
abilities. Strong (772) investigated the case of a life insurance salesman
whose annual sales were not as great as would have been anticipated of one
with a test score as high as his. It developed that he had a private income
and therefore aspired to sell only enough insurance to supplement his
income. In executive jobs company policies greatly affect the amount
earned: E. L. Thorndike (831:86) reports the cases of two presidents of
equally important and well-known corporations, one of whom received
a salary of $420,000 per annum, the other $125,000.

While making a job analysis of flying it occurred to the writer and a
colleague that a pilot’s ability to hit a target in air-to-ground and in
air-to-air firing should be a good index of flying skill, as the fixed gunnery
engaged in by a fighter pilot involves pointing the airplane and
maintaining it as a steady platform while squeezing the trigger. It would, it
was thought, have the unique advantage of being an entirely objective
index of flying skill obtainable before combat. It had the further
advantage that gun-camera photographs could be used, further simplifying
and objectifying the scoring. After some preliminary studies a large scale
study made under Neal E. Miller’s supervision at Randolph Field showed
that the reliability of air-to-air gunnery scores was .63 when 1200 rounds
were fired, and that air-to-ground scores had a reliability of .59 when
based on 400 rounds (833:52). While these reliabilities are high enough
for use in validation studies, they are surprisingly low for something as
objective as ability to hit a target, and they are among the best of such
results. A study of the reliability of bombing scores, also cited by
Thorndike (833), reports a median reliability of .08. As Kemp and others have
shown in the original studies (421:42-52), so many factors enter into the
accuracy with which bombs are dropped that one cannot predict the
performance of a given bombardier from one mission to the next unless
he flies with the same crew and the personnel factors are thereby kept
constant; even then, weather provides a vitally important but extraneous
variable.
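
Because the reliability of a score of this kind depends on the number of
rounds on which it is based, the Spearman-Brown formula offers a rough way
of seeing what a longer sample of firing would be expected to yield. The
formula is introduced here as an aid and was not necessarily used in the
studies cited; it assumes that added rounds behave like the original ones,
which may not hold under changing flight conditions. The function name is
hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a measure is lengthened by length_factor,
    assuming the added material is parallel to the original."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# The air-to-ground figure quoted above: .59 on 400 rounds. Projecting to
# 1200 rounds is a lengthening by a factor of three.
projected = spearman_brown(0.59, 1200 / 400)
print(f"projected reliability for 1200 rounds: {projected:.2f}")
```

The projection applies to air-to-ground firing only; it is not a comparison
with the air-to-air figure, which concerns a different task.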

Output may also be judged somewhat more subjectively, by having
experts evaluate the product as to quality. This is done by developing
a score sheet on which specific aspects of the work are rated and the total
score obtained by combining these ratings. This is a method commonly
used in evaluating school systems and in phase checks or performance
tests for aerial gunners, but it has not often been applied to civilian jobs.
The work to be evaluated need not be tangible, but may instead be
simply an observed performance as in the case of the standard flight
checks developed for pilots in the Army Air Forces. In these flight checks
the cadet performs certain highly standardized maneuvers, while the
check pilot or examiner records such objectively determined items as
the angle of bank in a steep turn, the time taken to complete it, and
changes in altitude. These observed performances provide an objective
basis for the performance score. Work along these lines did not progress
far enough for complete evaluation before the end of the war, but one
group of 16 selected items had a reliability of .39 for cadets with 15 hours
of training and .50 for men with 55 hours of flying (833:47).

Ratings of performance provide a widely used type of criterion,
probably the most common because of the relative ease of obtaining them.
The history of ratings has, however, been extremely disappointing, and
when they are relied upon today it should be only because of inability to
find or devise a better criterion and after systematic steps have been
taken to make them as reliable as possible. The literature on rating as
a technique is too well known to need reviewing here; it is well treated
in Symonds (810: Ch. 3), Strang (768: Ch. (i), and Traxler (860: Ch. 7).
The recent work on the California Adolescent Growth Study (567),
although not concerned with vocations, provides suggestions for further
improving rating scales and their use. From the point of view of the
reader of the literature on the validity of tests, the questions to be kept
in mind have to do with the extent to which the ratings of one judge
agree with those of another, the possible influence of halo effect (the
tendency to rate specific traits on the basis of an overall evaluation), and
the relevance of the traits or behavior actually being measured to the
work in question. In one study (S;;;; ; r,o-r, i) in which airplane
commanders were rated while going through operational (combat) training,
the rating for “likeableness” had the highest correlation of any of the ten
traits rated with the overall rating of suitability for combat flying. There
would seem to be little relevance in this case, and considerable halo effect.
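
The two questions raised above, the agreement of one judge with
another and the influence of halo effect, can both be examined with
simple correlations. The following is a minimal sketch in Python rather
than a procedure taken from the studies cited: the ratings are invented,
the trait "technical skill" and the helper `pearson_r` are hypothetical,
and only the trait name "likeableness" comes from the text.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical overall ratings of the same eight men by two judges
# (1 = poor, 5 = excellent).
judge_a = [4, 3, 5, 2, 4, 1, 3, 5]
judge_b = [3, 3, 4, 2, 5, 2, 2, 4]
inter_judge_agreement = pearson_r(judge_a, judge_b)

# Hypothetical trait ratings made by judge A on the same men.
trait_ratings = {
    "likeableness": [4, 3, 5, 2, 5, 1, 3, 5],
    "technical skill": [3, 4, 4, 2, 3, 2, 4, 4],
}

# Correlation of each trait with the judge's own overall rating. A very
# high value for a trait of doubtful relevance suggests halo effect.
halo_check = {
    trait: pearson_r(values, judge_a)
    for trait, values in trait_ratings.items()
}

print(f"inter-judge agreement: {inter_judge_agreement:.2f}")
for trait, r in sorted(halo_check.items(), key=lambda kv: -kv[1]):
    print(f"{trait}: r with overall rating = {r:.2f}")
```

A low inter-judge correlation points to unreliable ratings; a trait such
as likeableness that correlates more highly with the overall rating than
job-relevant traits do is the pattern described in the study above.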

In studies of the use of tests in vocational counseling conducted in
England under the auspices of the National Institute for Industrial
Psychology (1 i ,2 i }:.v$8<),.]oi), and in a few American investigations
(lb.pyob), ratings of vocational adjustment have been used as a criterion.
In these instances the investigator usually makes a case study of the
individual in his work and gives him a rating for vocational adjustment
according to the extent to which he seems to be properly placed, satisfied
with his work, and satisfactory to his employer. Little attention has as yet
been paid to the adequacy of the judgments made by such investigators,
presumably because of the labor involved in having more than one
judge go over the necessary case material. In many respects, however,
this would appear to be an ultimate criterion of so desirable a type as to
justify giving time to devising more economical ways of using it and
more thorough study of its reliability.

Most users of ratings have obtained ratings of the traits or behavior
of individuals. In a few investigations the focus has been not on a person,
but on some tangible product of that person's work. When this has been
the case the results are somewhat more encouraging. One of the best
examples is the Minnesota Mechanical Abilities Project (588:201), in
which industrial arts teachers rated the shop products of junior high
school boys for quality of workmanship. In such rating the identity of the
worker can be disguised to avoid halo effect, thereby focusing attention
on the specific aspects of craftsmanship to be judged. The reliability of
the ratings in this study was .76 in the woodshop and .72 in the
sheet-metal shop. The principal weakness in such criteria, as in the case of
more objective output criteria, is the neglect of important human factors
not directly revealed in the product of the worker.

Self-ratings have occasionally been used as a criterion of success in
attempts to get at the less tangible and more personal aspects of vocational
adjustment (^77.1)07,71)0). The focus in these investigations has generally
been on the nature and extent of job satisfaction rather than on the
predictive value of tests, although Sarbin and Anderson (667) did study
the relationship between Strong’s Vocational Interest Blank and
satisfaction in work. In studying the value of tests in vocational selection,
the emphasis is appropriately on the effectiveness of the worker in
performing his task as indicated by ratings of supervisors or by output, but
as the use and study of vocational tests in counseling is improved it is
probable that more attention will be paid to ratings based on case studies
and to self-ratings, the former as an index of overall vocational
adjustment, and the latter as a criterion of the worker’s feelings of success
and satisfaction in his work. As self-ratings of job satisfaction such as are
provided by Hoppock’s scale and the occupational adjustment key of
the Bell Adjustment Inventory are further refined, to distinguish between
job and occupational satisfaction and between the various components of
each of these global concepts, they will probably find increasing use in
the validation of tests and inventories for vocational guidance.

Administrative acts which provide criteria of vocational success
include the obtaining of employment in a given field, promotion, increase
in pay, discharge or failure, and other tangible evidence that people
employed in the field consider the individual in question a success or failure.
These administrative acts have many of the drawbacks of ratings, and
are in fact administrative outcomes of ratings; on the other hand, they
are generally made after more serious deliberation than a rating is,
because of the obviousness and immediacy of their effects on employer as
well as on employee. Ability to complete flying training was thus the
best immediate criterion of success in the Aviation Psychology Program
of the Army Air Forces; promotions, decorations, assignment to first or
co-pilot duties, assignment to lead crews, removal from flying status for
flying errors, and removal from combat because of operational fatigue
(neurotic reactions to combat stress), were also used as intermediate and
ultimate criteria (833:55). The National Institute for Industrial
Psychology has frequently used ability to keep a job as a criterion (11,232);
in a period of depression, when jobs are scarce and promotions come
slowly, this is presumably a sound criterion, but in more prosperous
times, when transfers to better jobs are more easily obtained, and when
the scarcity of labor makes employers retain marginal and submarginal
employees, the criterion is obviously less adequate. This illustrates the
defect inherent in all administrative criteria, that is, the degree to which
they are affected by external factors. Ability to complete a training
sequence may depend in part upon changes in standards from one time to
another or from school to school: at one time, for example, one primary
flying school consistently eliminated 50 percent of its students and
another only 10 percent, despite control of the quality of the cadets sent to
them for experimental purposes without their knowledge (316:116). In
the last analysis, administrative acts make a good criterion because it is
in terms of them that success and failure are judged in daily life; at the
same time, it is important for the user of tests based on such indices to
know just what factors were operating in the administrative situation
at the time in question, and the effect of their presence on the criterion
and on the test validities.

Internal consistency (see page ('>52 for definition) is frequently used as an
index of the validity of a test, although it has no necessary significance
for vocational prediction. In the case of general intelligence, the
vocational significance of which has been demonstrated in numerous studies
with a variety of tests and for the measurement of which certain types of
items have amply been demonstrated to be effective, it may be sufficient
to check the internal consistency of a new test and to standardize it on
a good sample population for its results to be useful in vocational
guidance. Ascertaining its validity for specific occupations would be helpful
to counselors, but might be dispensed with if it interfered with better
validation of other tests. On the other hand, measures of special
aptitudes, of interest, and of personality are still so little understood, and
the nature and operation of these characteristics in determining
vocational success and satisfaction is so uncertain, that merely knowing that
the items in a test measure the same thing is insufficient. The score on a
test should be a measure of one characteristic rather than of several
unrelated traits or abilities, and the people who score high on one half of
the test should score high on the other half, in order that one may be sure
the test is measuring something and measuring it well; but the vocational
counselor, psychologist, and personnel man need to know that what is
being measured is related to success in the activity or activities in
question. This requires an external criterion of validity such as those
discussed in the earlier paragraphs of this section.
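
The requirement that those who score high on one half of a test also
score high on the other half can be checked by correlating the two
half-scores. The sketch below is illustrative only: the item responses
are invented, the odd-even division of items is one assumed way of
forming halves, and the Spearman-Brown step that estimates full-length
reliability is a standard formula added here, not one given in the text.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half_r(item_scores):
    """Correlate scores on odd-numbered items with scores on even-numbered
    items. item_scores holds one list of 0/1 item scores per examinee."""
    first_half = [sum(person[0::2]) for person in item_scores]
    second_half = [sum(person[1::2]) for person in item_scores]
    return pearson_r(first_half, second_half)

def spearman_brown(half_test_r):
    """Estimate the reliability of the full-length test from the
    correlation between its two halves."""
    return 2 * half_test_r / (1 + half_test_r)

# Hypothetical results: six examinees, six items (1 = right, 0 = wrong).
item_scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

half_r = split_half_r(item_scores)
full_r = spearman_brown(half_r)
print(f"half-test correlation: {half_r:.2f}; estimated reliability: {full_r:.2f}")
```

A high coefficient obtained in this way shows only that the test is
measuring something consistently; it says nothing about whether that
something is related to vocational success, which is the point of the
paragraph above.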

Knowing the various types of criteria discussed above, and their
advantages and limitations, the test constructor canvasses the situation in
which he is working to ascertain what kinds of criteria are already
available to him, and which could be made available if proper steps were
taken. Existing criterion data are analyzed in order to ascertain their
reliability. Supervisors who already rate their employees may be given
a refresher course in rating in order to make their results more reliable,
or statistical corrections may be made for constant biases which the data
have revealed in certain raters. Production records may be usable in their
present form, or it may be found that there is too little variation among
workers for them to serve as success criteria. If no suitable criterion
already exists, the psychologist must decide which possible criterion lends
itself most effectively to use in that situation and how data might be
collected. He may need to use a second-best criterion, because the data
are more readily gathered than those needed for the best possible index.
In any case, it is important that the criterion chosen be not only
obtainable and reliable, but also appropriate to the test or tests being
validated: relevance should not be sacrificed to convenience or to
objectivity. These decisions tentatively made, the next step is the building of
apparatus or the writing of items.

Test Construction

Once the nature of the characteristic to be tested and of the criterion
to be used in validating the test have been decided upon, the choice of
type of test and of test item is relatively easy. If the characteristic to be
tested has been isolated by job analysis procedures it may be a relatively
complex bit of behavior requiring a miniature situation test and
therefore, as a rule, apparatus. Or the characteristic may have been broken
down into relatively abstract components which lend themselves to
paper and pencil testing: thus in aviation cadet testing a large fraction of
the validity of certain apparatus tests lay in their measurement of spatial
visualization, a factor which was well tested by paper and pencil tests
used in the same battery (315, 316, 31p). Knowledge of the literature of
aptitude and personality testing is also a source of ideas as to how to
attempt to measure a given trait.

The type of test having been decided upon, the next step is to construct
the apparatus or to devise or write items. In the case of an apparatus test
first a sketch and then a rough pilot model is made in order to devise
suitable mechanical or electrical methods, to ascertain the most effective
size or sizes for the various parts, and to have a model for use in
experimental trials. In paper and pencil test construction the procedure is to
draw up an outline of the proposed contents of the test or inventory,
write, photograph, or draw items of those types, and refine them by
checking and rechecking. Thus in constructing a three-dimensional test of
spatial relations one would cut blocks of wood of various sizes with various
degrees of complexity, in order to ascertain which yield the best results;
in the case of a general information test one canvasses encyclopedias,
current magazines and newspapers in order to choose topics for items, and
makes up questions with suitable right and wrong answers.

The preliminary form or forms of the test having been prepared, the
test is tried out on a small group of subjects, who may be a sophisticated
group of co-workers or a sample of the type of subjects for whom the test
is designed. Ideally, both are done in order to get subjective comments
and criticisms from the points of view of both test constructors and
persons like those to be tested. In one project, for example, the author
helped devise a personality inventory for aviation cadets. The topics
covered had been selected by the test construction staff, items had been
suggested by cadets in free response answers to somewhat general
questions about their satisfactions and complaints in the Army, and questions
had been framed and multiple-choice answers put in tentative form by
the test constructors. This preliminary form was then administered to
a small group of aviation psychologists who had not worked on it and
to several small groups of cadets, who were asked to raise any questions
they wanted to and to criticize the items. Objectionable words or phrases
were pointed out, a few unrealistic answers were criticized, and better
substitutes were found.

Further revision of the test results from the above procedures, and the
test is reproduced for the collection of data on a larger scale. The actual
number varies with the facilities for trial testing, but is normally large
enough to make possible the establishment of time limits, the checking
of the clarity and completeness of directions, the locating of ambiguous
or offensive items, and the analysis of the internal consistency of the test.
The subjects at this stage should be a sample of those for whom the test
is designed, not only because different types of groups may require
different amounts of time or need directions which go into varying
amounts of detail, but also because items that work well with one type
of subject may not work well with another: for example, a question may
be well-phrased and have a right answer for unsophisticated subjects, but
may be unanswerable by more sophisticated examinees because of
oversimplification of matters which they know to be complex.

An analysis of the internal consistency of some tests is not possible at
this stage, either because some apparatus tests with time scores have no
items or parts, or because the test may not be scorable until it has been
item-validated. If, as is generally the case with aptitude tests, there is
an a priori method of scoring based on right and wrong answers, this
scoring key needs to be analyzed to make sure that answers keyed as
“right” are in fact generally chosen by those who make high total scores,
and that the wrong answers are more frequently chosen by those whose
total scores are low. The test is then revised again, in order to eliminate
poor items and sharpen those that are ambiguous, after which it should
be ready for large scale administration.
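
The analysis of an a priori scoring key described above amounts to
comparing, item by item, how often the keyed answer is chosen by high
and by low scorers. The following sketch assumes a simple split of the
group into upper and lower halves on total score; the responses, the key,
and the function names are hypothetical, and other groupings could be
used equally well.

```python
def total_score(answers, key):
    """Number of answers that agree with the scoring key."""
    return sum(a == k for a, k in zip(answers, key))

def discrimination_indices(responses, key):
    """For each item, the proportion of the upper half choosing the keyed
    answer minus the proportion of the lower half choosing it. Halves are
    formed on total score, which includes the item itself."""
    totals = [total_score(r, key) for r in responses]
    order = sorted(range(len(responses)), key=lambda i: totals[i])
    half = len(order) // 2
    low, high = order[:half], order[-half:]
    indices = []
    for j, keyed in enumerate(key):
        p_high = sum(responses[i][j] == keyed for i in high) / half
        p_low = sum(responses[i][j] == keyed for i in low) / half
        indices.append(p_high - p_low)
    return indices

# Hypothetical four-item multiple-choice test taken by six examinees.
key = ["A", "C", "B", "D"]
responses = [
    ["A", "C", "B", "D"],
    ["A", "C", "B", "A"],
    ["A", "C", "B", "B"],
    ["A", "B", "D", "D"],
    ["B", "C", "A", "D"],
    ["B", "A", "A", "D"],
]

indices = discrimination_indices(responses, key)
# Items whose keyed answer is not favored by the high scorers are
# candidates for revision or elimination.
flagged = [j for j, d in enumerate(indices) if d <= 0]
print("discrimination indices:", [round(d, 2) for d in indices])
print("items to review (0-based):", flagged)
```

In this invented example the keyed answer to the last item is chosen
more often by low scorers than by high scorers, which is exactly the
kind of item the revision is meant to eliminate or sharpen.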


The principal problem in administering vocational tests for
standardization and validation is whom to test, and at what stage of their careers.
The question of how many is more easily answered, at least in theory.
Whether the test is to be used in guidance or in selection (in which this
writer includes placement and promotion unless otherwise specified), it is
obvious that it should be standardized on persons for whom the chosen
criterion or criteria of success are or will be available. But this raises a
problem which has plagued psychologists since the beginnings of
aptitude testing, for if the test is standardized on a group who are already
employed in the occupation, and for whom criterion data are presumably
readily procurable, there will be a real question as to the value of the test
when used with persons who have not yet entered the field. Specifically,
will a low score made by a high school or college student indicate a
relative lack of the aptitude measured, or will it reflect primarily what is
already known, namely, his lack of training and experience in the field
in question? If, on the other hand, the test is administered to students or
others who have not yet entered the field in question, how is one to
validate it? The lag between testing time and that at which criteria of success
become available may be considerable, and the loss of cases through entry
into other fields not being investigated and through change of address is
certain to be almost prohibitive.

Longitudinal validation studies of the type just mentioned are rare.
Strong's studies of his Vocational Interest Blank have generally employed
the ex post facto validation of differentiation between people employed
in various occupations (775: Ch. 7), but he has also administered his
inventory to miscellaneous college students and followed them up about
ten years later (775: Ch. i(>) in order to ascertain the relationships
between their test scores on the one hand and entry into and stability in
various occupations on the other. Longitudinal validation has been used
more in selection programs, especially those involving training after
preliminary selection. The Armed Forces frequently selected on the basis
of tests which were first validated by giving them as though for use in
selection and then checking their results against success in training;
schools of nursing, medicine, engineering, and other professions do
likewise, although in these cases there is no guarantee of employment if
training is completed. As users of tests in personnel work become more
test sophisticated, as users of tests in guidance become more exacting in
their requirements, and as constructors of tests raise their standards
through familiarity with good practices, longitudinal validity studies
should become more numerous.

In the meantime cross-sectional validation studies are the commonly
available type. Strong first validated his inventory by contrasting the
answers of men in one field with those of men in other fields; Kuder is
now doing the same with his, although the first validation was by internal
consistency (802); the numerous sets of norms compiled by the Minnesota
Employment Stabilization Research Institute compare workers in one
field with those in others or with the general population (589); the
material comprising the bulk of this book deals with group differences and
relationship to success in training, rather than with success in an
occupation, because of this emphasis in the research. It may be well to point
out, however, that the result may not be as disastrous for vocational
counseling as one might suppose, for work by Strong and by Carter (145),
the most complete along these lines, shows that the results of some ex
post facto validated tests can legitimately be applied to untrained and
inexperienced persons if one knows what corrections to make for
maturation. This finding for Strong’s Blank has been confirmed in other ways
with other tests, for example, by determining the effects of training and
of age on the Minnesota Clerical Test (see Ch. 8).

The number of cases to be obtained, it has been stated, is more readily
decided upon than whom to test and when to test them, but is
inextricably involved in both of these. The determining factors are the number
needed in order to compute certain statistics and the number that can,
in a given practical situation, be tested. If the test is being standardized
for selection in a department which employs 200 workers in one job and
hires fifty new people each year, and if results are to be available for use
in a reasonable period of time, it is clear that pre-selection testing and
validation cannot be based on more than 50 or 100 cases, and that
validation upon persons already working is not likely to be feasible with more
than 250 or 300 cases. As these numbers are large enough for computing
correlation coefficients and critical ratios, test construction and validation
may well be worth while in this situation. Certainly the sample would be
adequate if the test is to be used only to select for that job in that concern
providing labor market and job remain the same, as it includes the whole
universe in question rather than just a sample.

If the test is to be standardized for counseling in connection with the
choice of an occupation the problems of numbers and sampling become
much more acute. While it is relatively easy to make sure that a job in
one factory is in fact one job rather than a number of different jobs,
making sure that the persons who are nominally engaged in a given
occupation are in reality doing the same type of work is almost
impossible, for if they are to be a good sample they must be distributed
throughout the country and analysis of their work is likely to be impossible. The
test constructor has then to content himself with other devices which may
help him select a well-defined and homogeneous group. He may, like
Paterson and his associates (588), confine his study to a thoroughly studied
and well-defined group of boys in one junior high school in one
community; he may follow their lead in a series of other studies (589), and select
a cross section of the employed population of one city which is
distributed among the major occupations in the same manner as the employed
population of the United States as a whole. Both groups may then
number only in the hundreds, being well selected. But in the former case, the
counselor must assume that success in mechanical activities will be judged
in the same way in his school or community as in Paterson’s, and that the
same psychological and social factors operate in his subjects in
approximately the same way, or he must refuse to use the test without a local
validation study of his own. In the latter case he must assume that
stenographers and typists in Minneapolis do the same types of work, requiring
the same types and degrees of aptitudes and skills, as the stenographers
in his own community, and similarly with retail salesmen, garage
mechanics, policemen, etc., or he must refrain from using the tests until he
has gathered his own norms and his own validation data. The assumption
may be quite sound in some instances and quite unsound in others; the
writer suspects that it may be true for bank tellers, but false for retail
sales clerks. Observational evidence for the latter assumption lies in the
differences between standards for clerks in dime stores and in more
expensive establishments, which govern both the referral of girls to such
stores by placement workers and their selection by employment
managers.

The solutions to the sampling problem used by Strong, who faced it
repeatedly and was primarily interested in the counseling values of his
test, followed no uniform pattern and illustrate the opportunism which
problems of time, money, and co-operation have forced upon test
construction workers. The psychologists on whom Strong standardized his
psychologist key constituted more than one third of the full members of
the American Psychological Association at the time of standardization
(some 200), and were scattered throughout the country, having been
reached through the membership list of the Association. This would seem
to be a good sample of academic psychologists, although it may have
slighted applied psychologists, some of whom were not members of the
Association. On the other hand, the group upon whom the key for social
science teacher was standardized consisted of more than 200 teachers
employed in the state of Minnesota. They may have been a good sample
of such teachers in that state, but there is no way of knowing whether
they were also typical of social science teachers in New Hampshire with
its rather different population, in Georgia with its different culture and
salary standards, or in other states and localities. Obviously, the counselor
using such a test needs to know the characteristics of the population on
which it was validated, and the extent to which the latter resembles the
population with which he is working, before he can draw any legitimate
conclusions from its scores. It is, therefore, important for the test
constructor to choose his validation group well and to describe it in detail.

Validation

The terms standardization and validation have been used
synonymously in the preceding section, because the standardization of a
vocational test implies collecting data which make possible validation. If the
test is administered to persons with whom its use is appropriate, norms
are gathered, and its significance ascertained, much of the process of
validation is already accomplished in standardization. In the sense in
which the term is used in the sequence of steps outlined here, validation
is therefore the statistical procedure of analyzing test results in relation
to criterion data (see page (L31 for definition). In work with some types
of tests this process consists of just one step, the determination of the
relationship of test scores to the criterion; in work with other types of
tests, however, it involves another step before scores can be validated,
specifically, the validation of each item in the test.

Item validation, the determination of the extent to which a given
question is answered one way by the “success” and other ways by the
“failure” group, impresses the novice as a laborious procedure. It is this,
but it frequently proves its worth and is often indispensable to test
construction. For example, the writer and three associates developed a
personality inventory, referred to previously, for use in aviation cadet
selection and classification. The items had no inherently right or wrong
answers, as they dealt with satisfaction and dissatisfactions in such things
as drill, strafing ground troops, bombing towns and cities, and being an
officer, but the item writers naturally had hypotheses concerning the
psychological soundness of the attitudes expressed, and of the possible
significance of these reactions for success in flying training. One of the
collaborators (John L. Wallen) constructed two a priori keys for this
inventory, one of them intended as a measure of morale, the other of
atypicality of attitudes and behavior. The former was strictly a priori,
but the other contributors to the inventory (Robert R. Blake and Joseph
Weitz) agreed that the responses scored as indices of poor morale would,
in fact, be considered symptoms of poor morale by most competent
judges. The atypicality key was more objective, in that it was empirically
derived: all responses chosen by small percentages of the cadets in the
standardization group before the predictive value of the test was known
were weighted in the atypicality key. One of the collaborators (Blake),
while agreeing with the logic of each step in the construction of these
keys, was convinced that they would not have any validity for success
in aircrew training; the others, though pragmatic in their attitudes,
thought they might prove valid. When criterion data in the form of
graduation-elimination reports arrived from primary flying schools and
scores on the two a priori keys were validated against them, the scoring
keys were found to have validities of approximately zero (801). The next
step was therefore to validate each item against pass-fail in flying
training; when this was done, quite a number of items were found to be
answered predominantly in one way by the successes and in other ways
by the failures. A new, empirical key was therefore made and
cross-validated on another part of the sample not used in the item validation: it
proved to have a validity of about .20 (significant at the 1 percent level).
While this was not very high, the test was unique enough for its
contribution to the cadet classification battery to raise the latter's validity from
about .66 to about .69, an improvement easily worth twenty minutes of
testing time and a moment of scoring (316:736-7 !<>).
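
The sequence just described (validating each item against pass-fail,
building an empirical key, and cross-validating it on cases not used in
the item validation) can be sketched as follows. This is an illustration
under stated assumptions and not the procedure of the study cited: the
responses are invented, and the unit weights, the `min_diff` threshold,
and all function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def build_empirical_key(responses, passed, min_diff=0.2):
    """Weight a response +1 if it is chosen more often by those who pass
    than by those who fail (by at least min_diff), and -1 in the reverse
    case. Responses that do not differentiate are left unkeyed."""
    n_pass = sum(passed)
    n_fail = len(passed) - n_pass
    key = {}
    for j in range(len(responses[0])):
        for option in sorted({r[j] for r in responses}):
            p_pass = sum(r[j] == option and ok
                         for r, ok in zip(responses, passed)) / n_pass
            p_fail = sum(r[j] == option and not ok
                         for r, ok in zip(responses, passed)) / n_fail
            if p_pass - p_fail >= min_diff:
                key[(j, option)] = 1
            elif p_pass - p_fail <= -min_diff:
                key[(j, option)] = -1
    return key

def score_with_key(answers, key):
    """Sum of key weights for the responses actually chosen."""
    return sum(key.get((j, a), 0) for j, a in enumerate(answers))

# Hypothetical item-validation group: three inventory items per cadet.
derivation = [["a", "x", "m"], ["a", "y", "n"], ["a", "x", "m"],
              ["b", "x", "m"], ["b", "y", "m"], ["b", "y", "n"]]
derivation_passed = [True, True, True, False, False, False]

# Hypothetical cross-validation group, not used in building the key.
holdout = [["a", "x", "m"], ["a", "y", "n"], ["b", "x", "m"], ["b", "y", "n"]]
holdout_passed = [True, True, False, False]

key = build_empirical_key(derivation, derivation_passed)
holdout_scores = [score_with_key(r, key) for r in holdout]
holdout_validity = pearson_r(holdout_scores, [int(ok) for ok in holdout_passed])
print(f"cross-validated validity: {holdout_validity:.2f}")
```

The essential design point is that the key is built on one group and its
validity is computed only on the other; a validity computed on the
derivation group itself would capitalize on chance differences between
the successes and failures in that group.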

This example brings out clearly the importance of item validation in
tests and inventories which have no inherently right or wrong answers,
for even the best logic often fails in constructing vocational tests. Even
when a test has right and wrong answers, however, the right answer is
not necessarily the best for persons in a given occupation. If, for example,
being well informed on the hobby of philately were characteristic of men
who succeed in pilot training, being able to select the correct definition
of the term “wove” from among four false definitions would be a “right”
answer for potential pilots; but, if knowing about stamps and stamp
collecting were characteristic of men who fail in pilot training, the correct
definition of the term would be a “wrong” answer for pilots. If the latter
were the case (the example is fictitious) a test of philatelic knowledge
might be validated as a test, without item validation; but one would need
to be certain that it was philatelic knowledge as such that was prognostic
of failure, and not just knowledge of certain aspects of stamp collecting
such as the technicalities of paper-making, colors, and perforations, as
contrasted with the historical and geographic knowledge which a careful
stamp collector also acquires. Hence the usefulness of item analysis, as
described by Davis (iy(i). This problem does not arise when the test is of
a clearly homogeneous type, for example, a spatial visualization test
utilizing two-dimensional forms in each item, for in this case both
logical analysis and internal consistency indices demonstrate the fact that
what is measured by one item is also measured by other items.

The validation of scores is generally done by correlating the score
made on the test with the criterion data. Thus the validation of a test of
ability to judge spatial relations for military pilots involved
computing biserial correlation coefficients for test scores and pass-fail reports of
cadets who entered primary school after taking the test, and the
validation of Strong’s life insurance salesman’s key for success in selling life
insurance involved the correlation of dollar volume of sales with test
scores, using the product moment method (772). In many cases other
methods are used, the principal reliance in Strong’s insurance study, for
example, being placed on the analysis of the percentages of men with a
given letter grade on the interest inventory selling a given amount of
insurance (enough to make a living as a salesman). This method is known
as the percent of overlapping technique using a cut-off score, and differs
only superficially from a third, in which group differences are expressed
by means of a critical ratio. These are standard techniques described in
detail in elementary texts on statistics.

The choice of method is dictated by the form in which the data are
expressed: reports concerning having passed or failed a course cannot
be used in computing Pearsonian correlation coefficients, but do lend
themselves to the use of biserial r’s. Data like Strong’s lent themselves to
either correlation or percent-of-overlap analysis; although he used both
procedures, more emphasis is placed on the latter technique, because the
nature of interest scores makes letter grades more meaningful than
standard scores (775:67) and because the fact of earning or not earning
enough money to live on seems more important in judging success as a
salesman than differences above or below that amount.
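
When the criterion is a pass-fail report, the biserial coefficient
mentioned above can be computed from the mean test scores of the two
groups. The sketch below uses the usual textbook formula, which assumes
that a normally distributed trait underlies the pass-fail division; the
scores are invented, and the function names and the companion
point-biserial coefficient (which makes no such assumption) are added
here only for comparison.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev

def point_biserial_r(scores, passed):
    """Correlation of scores with a true two-category criterion."""
    pass_scores = [s for s, ok in zip(scores, passed) if ok]
    fail_scores = [s for s, ok in zip(scores, passed) if not ok]
    p = len(pass_scores) / len(scores)
    q = 1 - p
    return (mean(pass_scores) - mean(fail_scores)) / pstdev(scores) * sqrt(p * q)

def biserial_r(scores, passed):
    """Biserial correlation, treating pass-fail as a cut made on an
    underlying normally distributed criterion."""
    pass_scores = [s for s, ok in zip(scores, passed) if ok]
    fail_scores = [s for s, ok in zip(scores, passed) if not ok]
    p = len(pass_scores) / len(scores)
    q = 1 - p
    # Ordinate of the unit normal curve at the point dividing it into
    # proportions q (below) and p (above).
    ordinate = NormalDist().pdf(NormalDist().inv_cdf(q))
    return (mean(pass_scores) - mean(fail_scores)) / pstdev(scores) * (p * q / ordinate)

# Hypothetical test scores and graduation-elimination reports.
scores = [12, 15, 9, 20, 7, 14, 18, 10]
passed = [True, True, False, True, False, False, True, False]

r_pb = point_biserial_r(scores, passed)
r_bis = biserial_r(scores, passed)
print(f"point-biserial r = {r_pb:.2f}; biserial r = {r_bis:.2f}")
```

The biserial value is always the larger of the two in absolute size, and
with small samples or extreme pass rates it can exceed 1.00, so that it
should be interpreted with the same caution as any other statistic based
on few cases.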

Cross-Validation

It has long been an accepted principle ol test construction that a test 
should be not only validated, but cross-validated, that is, administered 
to another comparable group and scored in the same way, to ascertain 
whether the validity for the second group is as high as for the first. This 
need was brought out by the fact that validities in subsequent studies 
were often lower than those in the original study of a test, as a result of 
special factors present in the criterion group which are not present in 
the cross-validation groups. These factors operate especially in small 
samples in which, for example, a disproportionate number of members 
may, as a result of pure chance or of administrative bias, come from one 
part of the country, be younger than the occupational universe from 
which they are drawn, or have some other things in common which are 
not so common in other samples of the same occupational group. 

A good illustration of the operation of this type of regression toward 
the mean is found in the author’s study of avocational interests (791:60), 
in which scoring keys for the hobbies of model-train building, instrumental 
music, photography, and stamp collecting were found to regress from 
mean standard scores of 50 for the criterion groups to means of 36, 2.j, 
33, and 26 for the respective cross-validation groups. When expressed in 
terms of group differentiation, these results meant that, although the 
scoring keys differentiated quite well between the criterion groups on 
which they were based, they failed to differentiate other similar hobby 
groups in the case of the philatelic key, failed for all practical purposes 
in the case of photography, and differentiated somewhat in the case of 
model engineers and fairly well only in the case of amateur musicians. 
Strong (775:637 ff.) studied this problem, and found that, although 
groups could sometimes be differentiated with as few as 50 or 100 cases 
in the criterion group, better differentiation was obtained, with minimal 
regression toward the mean in cross-validation, when criterion groups 
of from 250 to 500 are involved. 

Although the need for cross-validation has been recognized in the 
literature it has in fact too often been honored in its breach because of 
practical reasons such as time, money, and the difficulty of obtaining 
co-operation from sufficiently large groups. Some dramatic instances of 
reversed relationships in cross-validation are reproduced from Stead and 
Shartle (750) in Figures .{ and r t (pp. 169 and 170). 

The experiences of psychologists in World War II have again driven 
home the fact that cross-validation is essential, despite Strong’s conclusion 
(775:r>[) () ) that, when a large criterion or original validation group 
is used (and additional cases are difficult to obtain), cross-validation may 
be dispensed with. Experience repeatedly showed that a test validated on 
several hundred aviation cadets might appear valid until evidence was 
obtained on another sample, at which time it would lose all semblance of 
validity. In one study the Rorschach Psychodiagnostic was administered 
to cadets, and ratings of their probability of success in training were made 
by trained examiners who were also somewhat familiar with the requirements 
of flying training. In the validation or criterion group consisting of 
every other tested cadet (N = 2fv), the biserial r with 
pass-fail was .23, the standard error being .09. When the cross-validation 
was completed on the other half of the tested group the correlation fell 
to approximately zero. The original figure was not very high, it is true, 
but with a battery of tests which occupied one and one-half days of the 
cadet’s time and had a validity of .(>(*>, each research test which had a 
validity of .20 and a low correlation with the tests actually in use was 
carefully scrutinized as a potential contributor to the battery, and a 
number of such were found which repeatedly yielded validity coefficients 
of about the same size and added .03 or .0 j to the validity of the battery. 

The techniques of cross-validation are the same as those of validation 
with the original group. Sometimes they are applied after a second round 
of testing and the collection of new cases, but more commonly it is found 
more practical to gather enough data at first to carry out both procedures, 
doing the validation on even-numbered cases, for example, and the 
cross-validation on odd-numbered cases. This insures controlling the effect of 
the times at which data are obtained, and yet provides two groups for 
study. 
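
The even-odd procedure can be sketched as follows. The cases are simulated rather than real, the built-in relationship between score and criterion is an arbitrary assumption, and the names are hypothetical; the point is only the manner of splitting one body of data into two groups gathered over the same period.

```python
import random
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def validity(group):
    """Correlation of test score with criterion within one group of cases."""
    test_scores = [score for score, _ in group]
    criteria = [criterion for _, criterion in group]
    return pearson_r(test_scores, criteria)


random.seed(0)  # fixed seed so that the sketch is repeatable

# Simulated cases, in order of testing: (test score, criterion measure).
cases = []
for _ in range(200):
    score = random.gauss(50, 10)
    criterion = 0.3 * score + random.gauss(0, 10)
    cases.append((score, criterion))

# Case numbers start at 1, so list positions 1, 3, 5, ... hold cases 2, 4, 6, ...
validation = cases[1::2]        # even-numbered cases
cross_validation = cases[0::2]  # odd-numbered cases

r_validation = validity(validation)
r_cross = validity(cross_validation)
print("validation r:", round(r_validation, 2))
print("cross-validation r:", round(r_cross, 2))
```

Because alternate cases go to each group, both halves span the whole testing period, which is the control on time of data collection mentioned above.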
Factor Analysis and Factor Validation 

A further step in test construction and validation has been added by 
Guilford (316: Ch. 28; 317, 319), through the application of factor analysis 
to test construction in personnel selection. Briefly, it consists of analyzing 
tests in order to ascertain their factorial composition, and of analyzing 
the criterion in order to determine the nature and weight of the factors 
which enter into it. The former step makes possible the refinement of 
tests, to make them factorially pure; this has the advantage of cutting 
down the number of tests needed to predict success, by eliminating overlapping 
of tests and making each test do a maximum of work. The latter 
step, analysis of the criterion, indicates what types of tests should be 
stressed in order to improve predictions. Illustrations of each of these 
procedures follow, again taken from aviation psychology because the 
most extensive applications to date were made in the Army Air Forces. 

Factorial Analysis of Tests. The use of factor analysis implies that 
tests can be statistically analyzed into a limited number of underlying 
traits or aptitudes, or, conversely, that existing tests actually measure a 
number of traits which can be isolated by statistical analysis. To attempt 
to describe the procedures of factor analysis would be out of place in this 
text, but some understanding of the significance of factor analysis for 
test construction and validation is in order. The application of the 
Thurstone centroid method of factor analysis with rotation of axes (839) 
to a battery of tests results in the isolation of three types of variances or 
components: 1) several common factors, that is, components which appear 
in several tests; 2) possible specific factors, appearing in only one test; 
and 3) error variance, arising from the unreliability of the measures. 
These common factors, having been arrived at by a process which is 
largely mathematical, may or may not make psychological sense; it is by 
rotating the axes that meaningful factors are made to emerge. This is a 
somewhat subjective procedure, calling for judgments on the part of the 
statistician. Even more subjective is the naming of the factors that have 
been isolated; this is done by inspection of the kinds of tests which are 
saturated or loaded with a given factor, to ascertain what the common 
elements seem to involve. 
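
The three kinds of variance can be made concrete with a small sketch. It assumes, as is customary in this kind of analysis, that scores are standardized (total variance of 1.00) and that the common factors are uncorrelated, so that each factor's share of a test's variance is the square of the test's loading on it. The loadings and the reliability figure are invented and are not taken from any actual battery.

```python
def variance_components(common_loadings, reliability):
    """Split a standardized test's variance into common, specific, and error.

    Assumes uncorrelated common factors, so each factor's share of the
    variance is its squared loading.
    """
    by_factor = {name: a * a for name, a in common_loadings.items()}
    communality = sum(by_factor.values())
    return {
        "by_factor": by_factor,
        "common": communality,                  # shared with other tests
        "specific": reliability - communality,  # reliable but unique to the test
        "error": 1.0 - reliability,             # due to unreliability
    }


# Hypothetical loadings for a reading test, and a hypothetical reliability.
loadings = {"verbal": 0.60, "mechanical experience": 0.40, "general reasoning": 0.30}
parts = variance_components(loadings, reliability=0.85)

for name in ("common", "specific", "error"):
    print(name, round(parts[name], 2))
```

The three portions necessarily sum to the whole variance of the test, which is why an unexpectedly large unknown or specific portion signals that the battery lacks other tests of the same factor.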

Guilford (317) provides an illustration of how factor analysis can help 
one better to understand what tests are measuring, graphically presented 
in Figure 2. 

Figure 2. Illustrating the complexity of simple tests and the unknown 
quantities in miniature situation tests. After Guilford (317). 

This figure shows the proportions of factor variances in three of the 
tests used in the AAF Aviation Psychology Program. These tests were 
developed in the standard ways already described. That is, it was thought 
that reading comprehension might play some part in flying success, so a 
Reading Comprehension Test was developed with aviation types of 
materials. Pilots, navigators, and bombardiers make much use of books 
of tables and take many readings from dials. A Dial and Table Reading 
Test was therefore developed, using dials such as those in airplanes and 
tables such as are used in navigation. Reaction time is frequently mentioned 
by pilots as an important characteristic in flying, quick response 
to a variety of stimuli being obviously important in taking off, landing, 
and in many emergencies; hence a Discrimination Reaction Time Test 
was constructed, along lines long used in laboratory studies in physiological 
psychology. 

The job analysis procedures used in developing these tests were 
obviously those of observation and deduction. The tests were, in the cases 
of Reading Comprehension and Discrimination Reaction Time, attempts 
to measure more or less unitary traits, and, in the case of Dial and Table 
Reading, an attempt to duplicate the job situation in miniature. 

As Figure 2 brings out, all three tests were complex in their factorial 
composition. This was true not only of the miniature situation test, 
which might have been expected to draw on a variety of abilities, but 
also of the two tests which are normally thought of as being simple in 
their composition. The reading test draws on the following abilities: 
verbal comprehension, mechanical experience (some of the content was 
mechanical), general reasoning, analogic reasoning, visualization, and 
several much less important factors. The discrimination reaction time test 
requires ability to judge spatial relations, psychomotor precision, per¬ 
ceptual speed, visualization, numerical ability, several minor factors, and 
a relatively large number of unknown factors, one of which might be 
reaction time. The dial and table test measures six major factors (number, 
spatial relations, perceptual speed, general reasoning, mathematical ex¬ 
perience, and psychomotor precision), a few minor factors, and some 
unknown factors. Such unknown factors, if not specific to the test, emerge 
because the test battery does not include enough other tests for them to be 
clearly recognizable. 

These three tests were found to measure, not three traits, but a total 
of eleven. Of these, six are measured by more than one test. This is clearly 
not economical, as one good measure of a given factor would be less time 
consuming than three tests. It is also inefficient from the point of view of 
prediction, as the validity of one test may be due to one of the factors it 
measures, whereas others that it also taps may actually tend to lower its 
validity, as when they correlate negatively with the criterion. In such a 
case positively and negatively significant factors tend to counteract and 
cancel each other in the same test. 
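
The cancelling effect can be shown with a little arithmetic. Under the simplifying assumption of uncorrelated factors, the correlation of a test with the criterion is the sum, over the factors they share, of the product of the two loadings. The loadings below are invented for illustration only; in particular, the negative criterion loading is hypothetical.

```python
def validity_from_loadings(test_loadings, criterion_loadings):
    """Test-criterion correlation implied by loadings on uncorrelated factors."""
    return sum(
        loading * criterion_loadings.get(factor, 0.0)
        for factor, loading in test_loadings.items()
    )


# Hypothetical criterion: helped by one factor, hindered by another.
criterion = {"spatial relations": 0.4, "number": -0.2}

impure_test = {"spatial relations": 0.5, "number": 0.5}
pure_test = {"spatial relations": 0.5}

print("impure test:", round(validity_from_loadings(impure_test, criterion), 2))
print("pure test:", round(validity_from_loadings(pure_test, criterion), 2))
```

In this invented case the impure test loses half of the validity it would have had from its spatial component alone, because its number component works against the criterion.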

The contribution of factor analysis to test construction is, therefore, 
to make possible the refinement and purification of tests, and to reveal 
what kinds of tests may actually be developed. The three tests just described 
yielded ideas for eleven different tests, some of which might be 
positively significant for pilot selection, some negatively, and some not 
at all. The construction of eleven separate tests makes possible the differential 
measurement of these eleven traits, and improves predictions based 
on the validities of these traits. Profiles showing the scores on independ¬ 
ent traits such as these are much more useful in counseling a person, 
provided the validity of the traits measured is known. Guilford’s unique 
contribution lies in his having not only isolated the underlying factors 
of this extensive battery of tests, but in having ascertained the signifi¬ 
cance of these factors for success in several occupations. This latter topic 
is expanded in the next paragraphs. 

Factorial Analysis of the Criterion. When factor analysis is applied 
to the criterion of success, two major types of results are accomplished. 
First, the occupational significance of the factors is made clear, permitting 
the counseling of individuals on the basis of factor profiles or the 
weighting of factors rather than of tests in selection programs. As is 
pointed out in the discussion of the Primary Mental Abilities Tests 
(Ch. (i), the drawback of factorially pure tests has been the lack of evidence 
to guide the interpretation of their results. The second outcome is 
a better understanding of what it is one is trying to predict, that is, of 
the nature of success in the occupation in question. Factor analysis of 
the criterion gives one an objective description of what it is that is being 
predicted, to supplement the observational data of traditional job analysis 
and the deductions from test validities. But the very nature of factor 
analysis imposes some limitations of a very serious nature on the second 
type of use of the technique with the criterion. As many writers have 
pointed out, one can extract from a factor analysis only that which is 
put into it. More concretely, the only factors which can be isolated are 
those which are tapped by more than one test in a battery. If, therefore, 
the battery of tests used in the analysis is limited in scope and fails to 
include some traits which might be measured (and all batteries are more 
or less open to this criticism), the analysis of the criterion will leave 
undescribed some of the abilities which it requires. An indication of 
the extent of these unmeasured components is, of course, provided by 
the unknown-factor variances. 
Figure 3, also taken from Guilford (317), shows two criteria analyzed in 
the same way as the three tests already discussed. The pilot criterion was 
found to be composed of 27 common factors; about 52 percent of 
the variance of success or failure in pilot training could be accounted 
for by 23 of these factors. If other tests of appropriate but unknown 
types had been included in the battery, another 28 percent of the 
variance could perhaps have been predicted, leaving 20 percent of the 
variance in success or failure due to lack of reliability. Nine known factors 
accounted for about 5b percent of the navigator criterion; apparently 
success in navigation training was more easily predicted, and less complex 
in nature, than was success in flying training. 
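
The partition of the pilot criterion reported above can be set down as simple arithmetic. The three percentages are those given in the text; the conversion to correlation coefficients is an added assumption, namely that the share of criterion variance accounted for equals the square of the multiple correlation.

```python
from math import sqrt

# Percentages of variance in the pilot criterion, as reported in the text.
pilot_criterion = {
    "known common factors": 52,
    "unknown common factors": 28,
    "unreliability": 20,
}

total = sum(pilot_criterion.values())

# Variance predictable with the factors already identified, and the ceiling
# that would be reached if tests of the unknown factors were available.
predictable_now = pilot_criterion["known common factors"]
predictable_ceiling = predictable_now + pilot_criterion["unknown common factors"]

# Assumption: proportion of variance accounted for = squared multiple correlation.
r_now = sqrt(predictable_now / 100)
r_ceiling = sqrt(predictable_ceiling / 100)

print("total percent:", total)
print("predictable now (%):", predictable_now, "r about", round(r_now, 2))
print("ceiling (%):", predictable_ceiling, "r about", round(r_ceiling, 2))
```

The remaining 20 percent cannot be recovered by adding tests, since it reflects the unreliability of the criterion itself.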

It is interesting to note that success in pilot and in navigator training 
have little in common, according to the data in Figure 3: only spatial 
relations and perceptual speed appear in both occupations. This is in 
contrast with the three tests for which factorial data were presented, 
and which overlapped more completely in their components despite 
superficial differences and some unique factors. 

What these data make clear for vocational counseling is that number 
ability is not important in success in pilot training, and need not receive 
attention in profile interpretation; that mechanical experience, visualization, 
and psychomotor precision (among other abilities) differentiate 
pilots from navigators; and that navigators, on the other hand, are 
helped by the possession of number ability and mathematical background. 
These facts are brought out more clearly by factor analysis of the cri¬ 
terion than they could be, for example, by an analysis of the differential 
validities of impure tests such as that for reaction time or reading 
comprehension. 

For improvement of predictions of success in pilot and navigator training 
these data make clear the facts that there is still considerable room 
for improvement in the test battery and for the development of tests 
measuring other factors. We have already seen that approximately 28 
percent of the variance in pilot success could still be predicted if suitable 
tests were available. The graph shows that there is probably less room 
for improvement in navigator selection. But just how this improvement is 
to be effected, or what types of traits should be tested, is not made clear. 
In order to get clues as to what these traits are, one must still depend on 
the traditional type of job analysis, whether for the selection of existing 
tests or for the devising of new instruments for inclusion in the battery 
and in the next factor analysis. 



CHAPTER II' 

THE NATURE OF APTITUDES AND 
APTITUDE TESTS 


Definitions 

THE term “aptitude” is generally used loosely both by laymen and by 
vocational psychologists and counselors. Its meaning varies not merely 
from one user to another, but even from one time to the next in the 
speaking or writing of a given psychologist or educator. It is used in 
either of two ways, as when we say that a man has a great deal of aptitude 
for art, meaning that he has in a high degree many of the characteristics 
which make for success in artistic activities, or when we say that a person 
lacks spatial aptitude, meaning that he lacks this one specialized aptitude 
which is of varying importance in a number of different occupations. In 
the former instance the word is used not to denote a unitary trait, nor 
even an entity of any sort, but rather a combination of traits and abilities 
which result in a person’s being qualified for some type of occupation 
or activity. In the latter case the word “aptitude” is intended to convey the 
idea of a discrete, unitary characteristic which is important, in varying 
degrees, in a variety of occupations and activities. 

These two different meanings have been attached to the term as a 
result of the tendency of psychology to use existing words which already 
have popular meanings, redefining them in the process for the sake of 
clear thinking, instead of coining new terms of Latin or Greek origin as 
is done in fields such as biology and physics. Both the popular concept of 
aptitude for a vocation and the scientific concept of aptitude important 
in vocations are essential; it is important, however, that the meaning 
intended be clear. In general, counselors and personnel men tend to think 
in terms of vocations and jobs, and therefore to use the term in the broad 
popular sense, while psychologists tend to think in terms of individual 
differences and traits, and therefore to use the term in the narrow scien¬ 
tific sense. As most of the literature on tests is written by psychologists, 
and most of the tests were constructed by psychologists, the counselor or 
personnel man needs to develop the habit of starting with the narrow 
scientific meaning of the term and of translating the psychological trait 
or characteristic into broader vocational terms. Similarly the psychologist, 
if his report of test results is to be meaningful and useful to the counselor, 
social worker, personnel man, or teacher, must be able to translate trait 
data into vocations. 

Various combinations of traits and abilities may make for success in a 
given field. One teacher, for example, may be successful because of schol¬ 
arly ability, interest in his subject, and a desire to share it with others 
which result in a clarity of presentation, a wealth of material, and a 
warmth of manner which more than make up for a relative lack of inter¬ 
est in people as individuals and a dislike of the routines and details of 
classroom management. Another teacher may be equally successful be¬ 
cause of his genuine interest in students, his warm and friendly manner, 
and his skill in classroom management, even though his scholarship and 
academic ability are mediocre. Similar differences could be pointed out 
among successful lawyers, salesmen, foremen, assembly-line workers, and 
probably even machinists and draftsmen, although the facts are not so 
clear in the case of the skilled trades and lower technical occupations. 

Because of the varying combinations of special aptitudes and traits 
which make for success in a given occupation, it is desirable to continue 
the scientific use of the word aptitude in testing and test research. For this 
reason, the term will be used in its narrower sense in this book, except 
when expressly defined otherwise, as in the phrase “aptitude for the 
medical profession.” 

Even in its narrower scientific sense, however, the word aptitude is by 
no means consistently and clearly used in the literature on tests. In 
Warren’s Dictionary of Psychology (910) it is defined as a condition or 
set of characteristics indicative of ability to learn. This implies that an 
aptitude is not necessarily an entity, but rather a constellation of entities; 
the set of characteristics which enables one person to learn something 
may even be different from that which enables another person to learn 
the same thing; in this case, we arrive back at the popular definition. 
Bingham (94:16-18) uses approximately the same definition, further 
confusing the picture by adding a readiness to develop interest in using 
the ability. In some unpublished material Seashore and Van Dusen have 
attempted to define the term more rigidly, saying that an aptitude is a 
measure of the probable rate of learning, which results in interest and 
satisfaction, and is relatively specific and narrow. The scientific study of 
an aptitude or of any other entity requires that one be able to name it 
(whether meaningfully or by means of a symbol such as x), describe it, 
and locate it in a variety of individuals and situations. This means that 
it must be relatively constant in its nature and composition. Warren’s 
and Bingham’s definitions are therefore useless to a scientist or to a 
counselor, while that proposed by Seashore and Van Dusen is more useful 
in that it prescribes narrowness and specificity. Accordingly, a scientific 
definition of aptitude would provide for specificity, unitary composition, 
and the facilitation of learning of some activity or type of activity. 

In practice, the requirement of unitary nature is frequently minimized. 
The Minnesota Vocational Test for Clerical Workers or number- and 
name-checking test is, for example, a test of about as simple an entity as 
one could expect to find, and yet factor analysis shows that the names 
test includes not just a speed and accuracy of discrimination factor identical 
with that in the numbers test, but also an intelligence factor not 
found to any appreciable degree in the numbers test (21). The Bennett 
Mechanical Comprehension Test, and others like it, are generally assumed 
to measure a special aptitude, and yet the best available evidence 
suggests that mechanical information and ability to visualize space relations 
play major parts in it (see below, p. 221). In our present state of 
knowledge and with the current refinement of our techniques, it seems 
wiser to be satisfied if the aptitudes measured are relatively distinct and 
have some validity, than to devote too much time to obtaining pure 
traits. The quick success of this global approach in Binet’s work with 
intelligence tests, discussed in Pintner (bop Ch. 2), has been borne out 
in aptitude studies such as the Minnesota Mechanical Abilities Project 
(588) and in interest research such as Strong’s (775), and even more recently 
in the slow rate of progress which has characterized the pure 
trait approach as used in Thurstone’s work on primary mental abilities 
(838) and Kuder’s work on primary interests (j,j(i). In Thurstone’s work 
the development of sufficiently refined and reliable instruments has been 
time-consuming and the results in terms of educational or vocational 
validity disappointing (see below, p. ip), and in Kuder’s it has taken 
thirteen years to develop an instrument with vocational significance (see 
below, pp. 445, and 459). This is not to decry the importance of such 
studies of primary abilities and interests, nor of the resulting tests; on 
the contrary, they undoubtedly are the beginning of a new era in aptitude 
and interest measurement and foreshadow tests which are more refined 
and more valid than any we now have. Guilford’s work (316: Ch. 28; 317; 
319) has demonstrated this. But for most practical purposes it is still true 
that the best available tests are those which do not over-stress the unitary 
nature and purity of the aptitudes or traits measured. 

A fourth and final characteristic of an aptitude should probably be 
added to our definition, namely, that it is relatively constant. If behavior 
or success is to be predicted, the entity upon which the prediction is 
based should be relatively stable. An aptitude which varied irrationally 
from one day, month, or year to the next would not provide a sound 
basis for predicting achievement at some future date. To put it statistically, 
an aptitude which is itself unreliable could be neither reliably 
measured nor significantly correlated with anything else. This question 
of the constancy of traits has, as the literature of recent years makes 
amply evident (-qjj,501,7^0,832,917,918), been a prime source of disagreement 
among psychologists. The attending controversies are too involved 
for adequate discussion to be possible here. It seems wiser to side-step detailed 
discussion and simply to state the author’s conclusion that, whether 
largely innate or largely acquired, the aptitudes about which we know 
something appear to become crystallized in early childhood and that after 
that they are relatively constant. They may then perhaps be affected by 
especially drastic or traumatic experiences, but can otherwise be thought 
of as not being appreciably affected by education, special training, or 
experience. This is not to imply, however, that specific practice on the 
items or materials of the test itself will not, through practice effect, raise 
the subject’s test score; the contrary is true, but that does not indicate a 
change in the degree of aptitude. As demonstrated in a number of different 
studies, interests and personality traits are crystallized later than 
aptitudes, in adolescence (1 j 1.5^8.771,775). The evidence for specific 
aptitudes and traits will be viewed later, as each test and the work done 
with it is studied in detail. 

Two other terms need brief definition. One of these is the word skill. 
It is used here, and in most discussions of abilities, as synonymous with 
proficiency, to denote the degree of mastery already acquired in an activity. 
Thus a typing test is a test of skill, and a trade test is a test of 
proficiency. The other term is ability, which Bingham (94:19) uses to denote 
either aptitude or proficiency or both, leaving it to the context to indicate 
the meaning, and which Seashore and Van Dusen prefer to use as a 
synonym for proficiency but not for aptitude. In view of the convenience 
of having a general term the writer prefers to use ability to include both 
aptitude and proficiency, using one of the latter terms when clarity and 
specificity require. The term trait, it might be noted, is used as comparable, 
in the field of interest and personality, with the term aptitude in the 
field of abilities. 

The Basic Aptitudes 

E. L. Thorndike once suggested that there are probably three types of 
intelligence: abstract, mechanical, and social. Since that time there has 
been a great deal of speculation and research on the nature and number 
of special aptitudes. T. L. Kelley used factor analysis and a variety of tests 
in order to study the question (.ji8), concluding from his data that apti¬ 
tudes may be classified as verbal, numerical, spatial, motor, musical, 
social, and mechanical. He provided also, in his scheme, for various types 
of interests. Spearman made another analysis (731), using other tests 
and a quite different method of factor analysis; since then he and his 
students in England have modified and elaborated his position, concluding 
that there are one general or intelligence factor “g,” a number of 
group factors such as word fluency, perseveration, and goodness of character, 
and many specific factors which are found only in one test or situation. 
Thurstone’s work (838,839) in factor analysis and the organization 
of special aptitudes has probably had more influence in America than has 
any other. Using the centroid method of factor analysis he isolated the 
following special aptitudes: number, visualization, memory, word fluency, 
verbal relations, perceptual speed, and induction. This research has 
borne fruit in the Chicago Tests of Primary Mental Abilities (see pp. 
132 ff), which measure six factors, number, verbal meaning, space, word 
fluency, reasoning, and memory. 

Two other factor analyses of aptitudes have followed Thurstone’s, each 
of them using a greater variety of tests and therefore isolating more 
factors than its predecessor. One of these was made by the United States 
Employment Service, under the direction of Shartle (735), and the other 
by the Army Air Forces, under Guilford's supervision (316,317). The lists 
of factors arrived at by each of these investigators are combined in Table 
2, in order to show how the list of presumably unitary human abilities 
lengthens as the investigations become more thorough-going. It will be 
noted, for example, that what Thurstone thought was one single aptitude, 
perceptual speed, was broken down into two factors, perception of sym¬ 
bols and perception of spatial forms, in the USES study, and into two 
apparently similar aptitudes in the AAF investigation. What Thurstone’s 
study isolated as one factor, memory span, did not appear at all 
in the USES research because no memory tests were used, but was 
broken down into three distinct types of memory factors in the AAF 
analysis. As might be expected in the case of a program which devoted 
considerable time and talent to the development of new types of tests, the 
Aviation Psychology Program battery revealed, when analyzed, far more 
primary traits than were isolated by the other investigations. Thurstone’s 
list included only eight factors, Shartle’s 11, and Guilford’s as many as 
28. The list will no doubt continue to grow, as evidenced by the 28 percent 
of the variance in pilot success which was not, but might be, predicted, 
if suitable tests were available. 

Table 2 

THE EXPANDING LIST OF PRIMARY ABILITIES 

According to Thurstone (839), Shartle (735), and Guilford (316). 

Thurstone 1938    USES (Shartle 1945)    A.A.F. (Guilford) 

Spatial    Spatial    Spatial Relations I 

Spatial Relations II 
(Right-Left Discrimination) 
Spatial Relations III 
(Unknown) 

Visualization 

Mechanical Experience 

Perceptual Speed 

Symbol Perception 

Perceptual Speed 

Spatial Perception 

Length Estimation 

Number 

Numerical 

Numerical 

Mathematical Background 

Verbal Relations 

Verbal 

Verbal 

Word Forms 

Memory Span 

Paired Associates Memory 
Visual Memory 

Picture-Word Memory 

Induction 

Intelligence 

General Reasoning 

Reasoning or 

Logic 

Analogic Reasoning 

Deduction 

Speed 

Aiming 

Sequential Reasoning 
Judgment 

Planning 

Simple Integration 
Complex Integration 
Adaptive Integration 

Psychomotor Speed 

Psychomotor co-ordination 
Kinesthesis 
Pilot Interest 
(Active-Masculine) 
Social Science Background 

Other factors which may in time be isolated and added to our list of 
human abilities are suggested by Seashore’s (690) and Meier’s (519) studies 
of musical and artistic ability, discussed in a subsequent chapter. In the 
meantime, the lists in Table 2 provide a good basis for job analysis and 
test selection or construction. 

Thurstone’s method of factor analysis provides for the isolation of 
independent factors or aptitudes. For this reason, most of the aptitudes 
named above are relatively independent of each other. Some, such as 
those normally included in the concept of general intelligence, are more 
closely related, but the intercorrelations are still lower than reliability 
coefficients, that is, too low to make a test of one aptitude or factor a 
good index of the score on the test of another factor. Tests of spatial 
visualization frequently have moderately high correlations with tests of 
intelligence, but this is an artifact arising from one or both of two causes, 
depending on the circumstances: first, tests of intelligence often include 
tests of spatial judgment (e.g., Army Alpha and the Army General 
Classification Test), and secondly, as Garrett has recently pointed out (1*81), 
this and other factors which appear to constitute intelligence in children 
become differentiated with increasing maturity and constitute, in reality, 
special aptitudes rather than aspects of general ability. Because their 
rates of maturation are similar, the more abstract abilities appear to be 
more closely related to each other than the more concrete abilities. Tests 
of manual dexterities, not included in most factor analysis studies, have 
been analyzed to show that the concrete abilities which they measure are 
more discrete and have lower intercorrelations than do the more abstract 
aptitudes; this will be seen in the chapter on tests of manual dexterities. 

Despite the demonstrated independence of special aptitudes, there is a 
tendency for groups of people who score high on a measure of “general 
aptitude” to make good scores on other tests, whether of special aptitudes 
or of personality traits. As Terman pointed out in his “Genetic Studies of 
Genius” (819), the good things tend to go together, a statement amply 
borne out by varied psychological and social data on more than one 
thousand gifted children who were followed into adulthood (823). It is 
therefore not surprising that, in counseling practice, one encounters 
persons who make high scores on tests of academic aptitude and on almost 
any other test one administers to them, and others who not only make low 
scores on tests of general mental ability, but distress one by also making 
low scores on any other instrument which is used in the search for some 
“hidden talent” which might be capitalized and built upon. It is well 
not to be overimpressed by such cases, however, as it has been 
demonstrated (603) that they are outnumbered by those whose aptitudes and 
personality traits vary considerably, giving them some assets and some 
liabilities. 

Methods of Measurement 

The most valid method of measuring an aptitude, that is, a unitary 
factor in the ability to learn something, would be to find out what part of the 
activity or skill to be learned is most heavily saturated with that factor, 
have the subject learn it, and compare his rate of learning with that of 
other persons with comparable backgrounds. This is in most cases an 
inordinately expensive method, although selection on the basis of success or 
failure in an initial learning period is still the method used by many 
colleges and professional schools which consciously admit two or three times 
as many beginning students as they expect to graduate and flunk those who 
make the lowest grades during the first year. It is the method that was 
used in selecting cadets for pilot training both in the AAF and in the 
RAF prior to the development of adequate psychological tests, and it 
is that used by many businesses and industries even now despite the 
interest of many in taking advantage of the possibilities of scientific 
personnel selection and the great strides made in this direction by some 
life insurance companies, manufacturing concerns, banks, and retail 
establishments. Experience as well as theory has demonstrated that it is 
less expensive and better policy in other ways to analyze the task in which 
success is to be predicted, develop and validate tests for predicting 
achievement in that task, and select on the basis of test and other personal 
data than to do a less careful job of initial screening and depend more 
on selection on the job. In the same way, it is less expensive, less 
discouraging, and less difficult for a high school or college student, unemployed 
man or woman, or adult considering a transfer or change of work, to take 
a series of tests and analyze his experiences in order better to ascertain 
his ability to learn a new task or to adjust to an occupation than it is for 
him to try it out as a probationist or actual employee. 

There are different types of tests of aptitudes, each of which has its 
disadvantages as well as its advantages. The user of tests, as well as the test 
constructor, should be familiar with these. They will be briefly described 
here in terms of contrasting types or dichotomies. 

Miniature tests may be contrasted with tests of abstract traits or 
aptitudes. In the former, the task in which learning or success is to be 
predicted is reproduced in miniature and perhaps simplified form, as, for 
example, in the familiar lathe-type or two-hand spatial judgment and 
co-ordination test. This miniature test, used successfully in selecting 
shop students, duplicates on a smaller scale both the apparatus and the 
arm and hand movements of a lathe. In the test of abstract aptitudes the 
job has been analyzed and one or more of its essential characteristics has 
been abstracted and put into test form. Thus in the MacQuarrie Test 
of Mechanical Ability there are a series of tests of eye-hand co-ordination 
and of spatial judgment, one of which involves tapping three times in 
each of a series of small circles, another tracing a line through the 
variously placed small apertures in a series of barrier lines, and still 
another judging the number of blocks touching others in a series of piles 
of blocks. In this case the test bears no superficial resemblance to the 
original task or activity, let us say lathe operation, but some of the 
essential aptitudes seem to be measured. 

The miniature type of test has a number of advantages. Its face 
validity or obvious similarity to the task in question makes it appeal to 
the examinee who is interested in such work; being a small-scale task, it 
is very likely to involve the same aptitudes and skills that are required 
by the criterion task and therefore to be highly correlated with it, that 
is, to be quite valid. One of the more valid tests used in the selection 
and classification of aircraft pilots by the Army Air Forces is the 
Complex Co-ordination Test, a “miniature” (life-size but simplified) 
stick and rudder test which, with its airplane controls and rows of red and 
green lights, appeals to aspirants to pilot training and involves some of 
the same ability to co-ordinate arm and foot pressures with each other 
and with visual stimuli which are involved in actually controlling the 
plane in flight. That its validity is not greater than it is (about . jo with 
pass-fail in pilot training [21 j]) is due partly to the fact that response 
to kinesthetic stimuli, that is to the “feel” of the plane through what fliers 
call the “seat of the pants,” is not required by the test, and partly to the 
fact that many other factors are important in good flying, especially when 
the criterion is not just actual flight but success in completing flying 
training. 

The advantages of the miniature test suggest some of its disadvantages. 
A test which seems to have a bearing on an activity which is, perhaps for 
a quite irrelevant reason, repugnant to the examinee will motivate him 
in the wrong manner, as did any test in aviation cadet classification 
testing which seemed to the would-be pilots to have special bearing on the 
work of a bombardier. One may be able to get a more nearly true measure 
of the examinee’s aptitude or interest with a test the significance of which 
is not so obvious. Steinmetz (75^) has demonstrated this with Strong’s 
Vocational Interest Blank, which is not a miniature test but which 
contains a number of items of obvious vocational significance. 

Another defect, of less immediate practical importance but more 
important theoretically and therefore ultimately in practice, lies in the 
miniature test’s unknown elements. Since it is a small-scale edition of 
the task, one has no objective way of knowing what psychological factors 
it measures. This may be very well in selection testing, when the 
important thing is to get the highest possible validities with the least possible 
effort, but in testing for vocational counseling it would necessitate 
another miniature test for each occupation or at least for each family of 
occupations to be considered in counseling. This would require an 
inordinate amount of test development and actual testing time. It is 
clearly more practical to analyze each occupation or activity into its 
important component factors, develop relatively independent tests of 
each factor or aptitude, validate each of these, and weight each test for 
each occupation according to its importance in that occupation. This 
makes possible testing for a large number of occupations with a relatively 
small number of tests. It is what was done in the Army’s aviation cadet 
classification program, one test being weighted heavily for pilot, 
moderately for bombardier, and not at all for navigator, whereas another might 
be weighted heavily for navigator, moderately for bombardier, and 
slightly for pilot, according to the demonstrated relationship between 
each test and the criteria of success in each activity as expressed in 
correlation coefficients and multiple regression equations. The same technique 
is being used by the Occupational Analysis Division of the United States 
Employment Service in the development of basic test batteries. What 
the abstract aptitude test loses in validity as a single test of one factor, 
it generally makes up as part of a battery of tests of known aptitudes 
combined to give an equally good or better prediction of the same 
criterion. Its principal defect lies in its lack of appeal to the less intelligent 
examinee, who is not challenged by an abstract task which has no meaning 
for him and who, if motivated in the right direction, is challenged by a 
test which resembles an everyday activity. For an excellent statement of 
the case for factorially pure tests, see Guilford (317). 
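
The weighting of a small battery for several occupations may be 
illustrated by a brief sketch. The test names, the weights, and the 
examinee’s standard scores below are hypothetical and serve only to show 
the arithmetic; actual weights would come from the regression equations 
established in validation studies. 

```python
from typing import Dict

# Hypothetical weights, not taken from any actual battery.
# test_a counts heavily for pilot, moderately for bombardier, not at all
# for navigator; test_b counts heavily for navigator, moderately for
# bombardier, and slightly for pilot.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "pilot": {"test_a": 3.0, "test_b": 1.0},
    "bombardier": {"test_a": 2.0, "test_b": 2.0},
    "navigator": {"test_a": 0.0, "test_b": 3.0},
}


def composite_score(standard_scores: Dict[str, float],
                    weights: Dict[str, float]) -> float:
    """Weighted sum of an examinee's standard scores for one occupation."""
    return sum(w * standard_scores[test] for test, w in weights.items())


def all_composites(standard_scores: Dict[str, float]) -> Dict[str, float]:
    """One composite per occupation, all from the same small set of tests."""
    return {
        occupation: composite_score(standard_scores, weights)
        for occupation, weights in WEIGHTS.items()
    }


# Hypothetical examinee: one unit above the mean on test_a, half a unit
# below the mean on test_b.
examinee = {"test_a": 1.0, "test_b": -0.5}
composites = all_composites(examinee)
best_fit = max(composites, key=composites.get)
```

The same two scores thus yield a different composite for each 
occupation, which is the economy claimed for tests of known aptitudes. 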

Much of what has been written about miniature and abstract trait 
tests applies also to performance and paper-and-pencil tests. A 
performance test is one involving doing something with materials or apparatus, 
whereas a paper-and-pencil test requires only marking responses to written 
or perhaps pictorial questions on a sheet of paper. The former may be 
abstract and the latter miniature type, as in the case of the Minnesota 
Spatial Relations Test and the O’Rourke Mechanical Aptitude Test. In 
the Minnesota Test the examinee places pieces of wood cut in the form 
of circles, crescent moons, oblongs, and various other shapes in the 
appropriate holes cut in a board; the assembly has no meaning, other than that 
of matching different shapes and sizes of objects and holes. In the 
O’Rourke test the subject marks blank spaces to indicate which 
mechanical objects, tools, etc., are used together or for specific purposes; the task 
has meaning, in that the objects and processes are taken from real life, 
are more or less familiar, and serve important practical purposes. But in 
general performance tests have the advantage of being more concrete 
and therefore seeming to be more meaningful to most people. Thus the 
Minnesota Spatial Relations Test, the real formboard, appeals to some 
examinees who rebel at the “unreality” of the Revised Minnesota Paper 
Formboard, a similar although not identical task in paper-and-pencil 
form. The reason for this is suggested by the relationship between the 
two tests, expressed by a correlation coefficient of .r,q obtained by the 
writer in an unpublished study of 100 NYA youths, and by the 
correlations of the two tests with measures of academic aptitude, the formboard 
having a correlation with the Otis S.A. Test of .25 and the paper 
formboard having one of in the same study. It appears that the 
paper-and-pencil test requires more abstract mental ability than the performance 
test, probably because all spatial manipulations in the former must be 
made mentally, in abstract form, rather than with actual materials, in 
concrete form. Paper-and-pencil tests are, because of the ease of group 
administration, cheaper than performance tests once they have been 
developed, and are often cheaper to develop because of the materials 
involved. 

Another dichotomy is that of tests as contrasted with inventories. These 
concepts are probably familiar enough to need little comment, other 
than the statement that the former are objective in that they require no 
judgments of self by the examinee, while the latter are subjective in that 
they ask the subject to judge or describe his interests, traits, or abilities. 
It is frequently stated that tests have right or wrong answers, whereas 
inventories have no right or wrong answers, what is right or wrong in the 
latter depending on what is true of the examinee. This definition is 
correct when applied to tests of intelligence and to inventories of 
personality or interests, but it is not correct when applied to tests of personality 
such as the Rorschach and Murray tests, which are objective and not 
self-descriptive but which have no right or wrong answers. It is also not true 
of a type of personality test developed in military aviation, which is 
objective but in which the correct answer is sometimes the wrong one 
and a wrong answer is sometimes the “right” one, right, that is, for one 
who is likely to succeed in certain types of occupations. Tests have the 
advantage of being less affected by the desire to make a good impression 
and by lack of insight than inventories, but are sometimes more expensive 
in administration and scoring than inventories. This is especially true 
in the field of personality and interest, although the developments in 
military aviation testing mentioned above, and some comparable civilian 
work, suggest that this may soon cease to be true in the field of interests 
(see pages j;() ff.). 

A fourth dichotomy into which tests may be classified is that of speed 
tests as opposed to power tests, illustrated in the intelligence field by the 
Otis and the CAVD Tests. The relative importance which should be 
attached to each of these has long been a subject of debate in 
psychological testing, and also fortunately of research. Baxter (^2,5;;), for example, 
has shown that the Otis Self-Administering Test of Mental Ability, 
administered as a speed test, is a good measure of what will be done by the 
same subjects when the test is administered as a power test. Tinker (JS ] 7) 
analyzed the Revised Minnesota Paper Formboard as a measure of speed, 
power, and level, and found that the first two are highly correlated. Both 
of these studies were made with college students; if they had been 
conducted with older subjects the results might have been different, as Lorge 
(jSa) has shown that older persons do as well as younger subjects on 
power tests, but are handicapped on speed tests. The advantages of 
briefer and uniform timing suggest the use of speed tests with younger 
persons, and power tests with persons in their forties or above. Perhaps 
the one valid reason for using power rather than speed tests with younger 
subjects is that speed in the paper-and-pencil situation is not identical 
with speed in the life situation, but this is still a matter of supposition 
which has not been put to experimental proof. 

Finally, there is the dichotomy of individual versus group tests, 
illustrated in intelligence testing by the Wechsler-Bellevue and the Otis or 
American Council Tests. In the former one has the advantage of being 
able to observe individual reactions and to adapt directions to the intent 
of the test, rather than having to follow their letter because a 
modification which would be fairer to one might handicap another subject in the 
group. In the latter, social stimulation, competition, the safety of 
numbers, the group example, and externally standardized conditions facilitate 
good results. 

In vocational testing the optimum conditions vary with the circum¬ 
stances and with the personality of the examinee. Sometimes it is better 
to test an individual alone, whether with group or individual tests; some¬ 
times it helps to have him take tests as one of a group. In school situations 
the latter is more often the case; in a consultation service for adults the 
former is frequently better policy, although small groups are acceptable. 
In vocational selection, candidates actively seeking employment are prob¬ 
ably just as well tested in groups, except in the case of applicants for 
higher level jobs who may feel that they deserve individual treatment. 
In military selection and classification group testing probably gets better 
results whether tests are taken voluntarily or by prescription. As will be 
seen in the next chapter, group testing requires either small groups or a 
large group divided into sections each with its own proctor who can ob¬ 
serve it, supervise it, and give attention to special cases. 

In view of the frequent psychological (and financial) superiority of 
group testing it is desirable for vocational tests to be suitable for use in 
groups; they can just as easily be administered individually when that is 
preferable. A limited number of tests can be administered only on an 
individual basis or to groups of four to six examinees; these should of 
course be used when they add to the efficiency of the battery and improve 
the quality of the diagnosis. There are no inherent qualities in either 
group or individual tests which make one type generally better than the 
other; they must, rather, be considered on the basis of their own validity 
and of the situation in which testing is to be done. Sometimes a test can 
be so constructed as to be either a group or an individual test in every 
sense of the term: the two forms of the Minnesota Multiphasic 
Personality Test are an example. It would be helpful to see some good studies of 
the validity of the two forms; the opinion of the test’s authors (356) is 
that when it is administered as an individual test the subject considers 
each item (printed on a separate card) more carefully and responds more 
truthfully than when it is administered in the group form (printed in 
booklets) and one item closely follows another. 

The next chapter deals briefly but systematically with methods and 
problems of test administration, both individual and group, from the 
point of view of the user of vocational tests, leading up to the chapters 
which treat specific aptitudes and tests in considerable detail. 



CHAPTER T 


TEST ADMINISTRATION 
AND SCORING 

A PSYCHOLOGICAL test is a measuring instrument. The reason for 
using measuring instruments rather than guesses or judgments based on 
unaided observation is that psychological tests, like rulers, micrometers, 
calipers, and scales, are more accurate than the naked eye. Since the 
fundamental reason for resorting to psychological tests is the accuracy 
of which they are capable, it should go without saying that the user of 
tests should take pains to give them according to the directions and to 
do everything possible that will assure accurate results. And yet, in 
everyday practice, one observes countless careless errors in the use of tests, some 
of them probably not important, but others of vital importance. A few 
such are described in the following paragraphs. 

The Minnesota Spatial Relations Test was originally designed and 
standardized as a black formboard, the small pieces which fit into the 
varied shaped holes also being painted black on top (see p. liSr, lb). 
Although none of the original publications dealing with it so state, it was 
administered in the validation studies with the subject standing (personal 
letter from Professor Donald G. Paterson, dated August i j, 19 jh). And 
yet the copies of the test, supplied by one well-known manufacturer and 
publisher of test materials, are painted black with green insets which 
probably change the visual problem involved and perhaps make it easier, 
and the test is administered in some consultation services with the subject 
standing, in others with him sitting, and in still others either way, 
according to the client’s preference! The writer and a colleague (Charles N. 
Morris) made a study of the effect of taking the tests in these two different 
ways, with the somewhat inconclusive finding that the presumably 
improved perspective which is associated with standing above the formboard 
tends to result in better scores. The problem of color has not been 
investigated, but it seems likely that, contrary to widespread custom, the 
available norms can legitimately be used neither with the green insets 
nor when the examinee is seated. Wilson and Carpenter (<)‘;s) have shown 
that the norms for the Crawford Spatial Relations Test, based on the 
original aluminum form, are not applicable to the marketed wooden 
form. 

A consultation service psychometrist was giving the American Council 
on Education Psychological Examination to a client whose other test 
results seemed conflicting. There was some informal conversation first, 
after which the examiner rather casually read the directions and 
proceeded with the test. While working on the first timed part the client 
was puzzled and asked a question in the same informal way in which the 
proceedings had been conducted from the start. The psychometrist 
answered the question in some detail, then, realizing that some time had 
been used in which the examinee should have been working on the test, 
allowed an extra minute for that part. As a result of both of these errors, 
the score could be considered only a crude measure and the client’s 
intellectual status was still not definitely known. 

In scoring a test used in a large-scale testing program, a clerk failed to 
invert the scores in order to change the high time scores (score = number 
of seconds) to low rank scores. This error resulted in giving high standing 
to those who had the least aptitude, and low standing to those who had 
the most. Fortunately, the error was caught in a routine audit in time to 
prepare a new set of reports; had it not been, time and money, not to 
mention human energies, would have been wasted when many of the 
poorer risks failed to make good in an assignment to which they should 
never have been sent. 
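
The conversion which the clerk omitted can be sketched as follows. This 
is only an illustration under stated assumptions: the function names, the 
treatment of tied times, and the sample times are hypothetical and are not 
drawn from the program described. 

```python
from typing import Dict


def ranks_from_time_scores(seconds: Dict[str, float]) -> Dict[str, int]:
    """Convert time scores (seconds) to rank scores, rank 1 being the fastest.

    An examinee's rank is one more than the number of examinees with a
    strictly shorter time, so tied times share the same rank.
    """
    times = list(seconds.values())
    return {
        name: 1 + sum(1 for other in times if other < own)
        for name, own in seconds.items()
    }


def audit_direction(seconds: Dict[str, float], ranks: Dict[str, int]) -> bool:
    """Routine check that short times received high standing, not low."""
    fastest = min(seconds, key=seconds.get)
    slowest = max(seconds, key=seconds.get)
    return ranks[fastest] == 1 and ranks[slowest] == max(ranks.values())


# Hypothetical time scores in seconds for four examinees.
time_scores = {"A": 95, "B": 140, "C": 95, "D": 210}
rank_scores = ranks_from_time_scores(time_scores)
```

A check of the kind performed by `audit_direction` is inexpensive, and 
it fails at once if the scores have been left uninverted. 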

Perhaps the cause of errors such as the above lies in the very simplicity 
of the directions for giving and scoring most tests. The novice’s reaction 
is that anyone can give most tests, if he knows how to read, and it is true 
that they are written out so that one should know exactly what to do. 
But their simplicity is deceptive, and errors are frequently made both in 
following the directions too slavishly when they are poorly written, 
inappropriate to the situation, or not sufficiently precise, and in departing 
from the directions when there is no need to do so, in ways not true to 
their intent. For this reason it is necessary to devote some space to the 
methods and problems of test administration, even when a background 
of knowledge of the field of measurement is taken for granted. 

Arrangements for Test Administration 

Freedom from distractions is one of the first considerations in providing 
space for test administration. If the examinees are to be free to 
concentrate on their work they must, obviously, not be disturbed by people, 
incidents, noises, or views which attract their attention away from the tests. 
This seems very simple, until one attempts to define distraction. Studies 
of the effects of noise on work have shown, for example, that typists are 
able to do as much work, with as high a degree of accuracy, under noisy 
conditions as under quiet conditions, although more strain results in the 
former (901:506-511). In the group testing of aviation cadets the presence 
of low-flying planes overhead, where they could not be seen, appeared to 
have no distracting effect on cadets actually taking tests, although if an 
especially low-flying plane could be seen it attracted some eyes. Super, 
Braasch, and Shay (805) found that “normal” distractions had no effect 
on test scores in an experiment with graduate students. Apparently a 
great deal depends upon how much the examinee wants to exclude the 
distracting factor from his attention: if he is well motivated, incidental 
noises will not bother him, whereas if he is not interested in doing well 
on the tests he will seize upon the slightest excuse for attending to other 
matters. As one cannot always take good motivation and good work 
habits for granted, the examiner must take what precautions he can to insure 
freedom from distractions. This means that he should have the use of 
a room through which there is no passage and to which no one needs to 
have access during testing; a room without disturbing views of passersby 
in the corridor or outside the windows; a room not affected by noise in 
adjacent rooms, corridors, or play space; a room in which the temperature 
is normal and constant. 

Good working space for the individual examinee is a second 
consideration, whether testing is on an individual or group basis. In the former 
this means a table for the examinee, so placed that the examiner can sit 
opposite him, and a second table so placed that the examiner can reach 
and manipulate the test materials easily and inconspicuously. In group 
testing, good working space consists of a flat top large enough for the 
examinee to be able to rest his elbows without touching the persons next 
to him and to spread out his papers without exposing them to the eyes of 
his neighbors; this may be made somewhat smaller on especially 
constructed testing tables by building upright partitions about ten inches 
high to shut off the view of the neighbor’s work. Tablet arm chairs such 
as are used in many college lecture rooms are not desirable for timed 
tests, especially if separate answer sheets are used, as Traxler (867) has 
demonstrated. These two considerations of sufficiency of space and 
privacy of work are disregarded with surprising frequency. One can 
sometimes make the best of crowded conditions by using more than one form 
of the same test. 

Advance preparation of materials insures having everything needed 
during the testing (scratch paper for some parts of some tests is frequently 
forgotten in large-scale group testing), cuts down the time needed for 
test administration, and results in better morale among examinees. In 
group testing this involves preparing a list of items needed, from pencils 
to test blanks, and of the quantity to be provided; sorting the materials 
according to type and sequence in which they are to be used; and 
counting them out according to the number of subjects to be seated in each row 
and the number of rows in the room. This last step saves a great deal of 
time and confusion in handing out materials, and prevents the pocketing 
of excess copies of confidential test booklets. In individual testing the 
steps are essentially the same, but more attention is focused on placing 
the materials on the examiner’s table for maximum availability during 
testing. 
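
The counting out of materials by rows can be put in the form of a short 
sketch. It assumes, for illustration only, that rows are filled in order and 
that every row has the same number of seats; the function name and the 
figures are hypothetical. 

```python
from typing import List


def stacks_by_row(examinees: int, seats_per_row: int) -> List[int]:
    """Number of copies to count out for each occupied row.

    Assumes rows are filled in order, so only the last occupied row
    may hold fewer examinees than it has seats.
    """
    if examinees < 0 or seats_per_row <= 0:
        raise ValueError("examinees must be >= 0 and seats_per_row > 0")
    full_rows, remainder = divmod(examinees, seats_per_row)
    stacks = [seats_per_row] * full_rows
    if remainder:
        stacks.append(remainder)
    return stacks


# Hypothetical session: 47 examinees in a room with 10 seats per row.
stacks = stacks_by_row(47, 10)
```

Because the stacks add up to exactly the number of examinees, no excess 
copies of a confidential booklet are left in circulation. 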

Good proctoring is a prerequisite of good testing which results only 
from securing the assistance of enough proctors and seeing to it that 
they understand their work. The experience of persons working on 
large-scale testing programs with both students and military personnel has led 
to recognition of the fact that, when large numbers are being tested, 
there must be one proctor or testing assistant for every 20 or 25 
examinees; if fewer proctors are provided, supervision is likely to be 
inadequate. The functions of the proctors are to distribute test materials, 
collect them after use, provide sharp pencils when needed, be alert for 
problems arising from inadequacy of materials (e.g., a blank page where 
there should be printing), from insufficient grasp of directions the 
understanding of which is assumed once directions have been given (e.g., 
marking answers on the booklet instead of on the separate answer sheet when 
provided), from “bugs” or defects in the test or test directions which 
should be recorded for the future improvement of the test, and from 
abnormal personality traits or poor motivation on the part of examinees. 
Proctors work most effectively when they have not only studied the tests 
and test directions, but also taken the tests and administered them. In 
large-scale testing operations which have considerable continuity the 
establishment of training programs to provide for these experiences is not 
uncommon; in other testing programs the administrator should make the 
best provisions for familiarizing the proctors with the tests and with the 
problems which may be encountered in administering them that the 
situation permits. Testing assistants easily get the feeling that their function 
is a routine one with neither responsibility nor glory. Everything the 
administrator can do to make them aware of the responsibility they carry 
and thus insure their careful attention to their work is therefore worthy 
of consideration. 
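
The rule of one proctor for every 20 or 25 examinees lends itself to a 
simple calculation, sketched here with a hypothetical function name and 
hypothetical group sizes. The ratio is rounded upward, since a fractional 
proctor leaves part of the room unsupervised. 

```python
def proctors_needed(examinees: int, examinees_per_proctor: int = 25) -> int:
    """Smallest number of proctors giving at most the stated ratio."""
    if examinees < 0:
        raise ValueError("examinees must not be negative")
    if examinees_per_proctor <= 0:
        raise ValueError("examinees_per_proctor must be positive")
    # Ceiling division done with integers to avoid floating-point rounding.
    return -(-examinees // examinees_per_proctor)


# Hypothetical group of 180 examinees under the two ratios named in the text.
at_twenty_five = proctors_needed(180, 25)
at_twenty = proctors_needed(180, 20)
```

The stricter ratio of one to 20 calls for one more proctor in this 
example than the ratio of one to 25. 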

The duration of testing, assuming that more than one or two tests are 
to be given, depends on the maturity and motivation of those taking the 
tests. In testing for the vocational guidance of high school juniors and 
seniors and of college freshmen the writer has found two days of testing 
and filling out records, consisting of three hours each morning and two 
hours after lunch, quite acceptable to the students. The College Entrance 
Examination Board (iu;) found that fatigue played no part in six hours 
of testing. When motivation is not so strong and co-operation not so good, 
periods of two hours may be all that are wise, and there may have to be 
fewer periods. For example, when testing returned combat fliers in an 
Army Air Forces Redistribution Station it was found that two three-hour 
test sessions, both on the same day, were feasible, but the examiners and 
proctors needed special skill at times in the handling of recalcitrant 
officers and men who balked at the length of the testing period. Even in this 
military situation tact went further than authority, and one of the best 
examiners with poorly motivated returnees was a civilian woman 
psychologist who knew how both to jest with and to mother belligerent 
gunners and bombardiers. Making clear to examinees why they are taking 
the tests and how the results will affect them (discussed in a subsequent 
paragraph) and letting them know at the start just how long the test 
sessions will last are two essentials to the winning of co-operation in test 
administration. If the examinee wants to understand himself, wants to 
get a job, or wants to help others like himself (the desire to help other 
fliers who were going to combat motivated many returnees in the 
completing of research questionnaires), he can put in more than a full day 
of taking tests. 

Provisions for the recording of the proceedings should also be made
ahead of time. Decisions should be made as to the type of records to be
kept, and appropriate forms provided. The times at which tests are begun
and stopped should be recorded, as they are occasionally needed later
when checks are being made on accuracy of timing. Problems arising
during testing should be noted, for their value in interpreting the
behavior of individuals or the significance of the test results. Examiner and
proctors both have a part in this work.

Testing individuals in groups is a practice frequently made necessary
by lack of space and personnel even in consultation services and business
enterprises where schedules and needs vary from one person to the next.
When there are a number of persons to be tested with different tests, and
fewer examiners and rooms than there are batteries to be administered
(a chronic condition in guidance centers and personnel offices), there is no
alternative. The space must then be arranged so that individuals or small
groups can be sufficiently isolated from others in the same room so that
they can work undisturbed by directions not intended for them, stocks
of materials must be kept in such a way as to make them readily available
to all examiners as needed, and each examiner must develop skill in using
several stop watches or chronometers and in shifting from individual to
individual as timing requires. Space must, of course, permit easy
circulation of examiners and of entering and departing examinees.

The Preliminaries to Testing 

The checking of all arrangements discussed in the preceding section is
naturally the first preliminary prior to the starting of testing, in order to
be sure that everything necessary is ready for things to go as planned. Test
administration seems so very simple to the average examinee that its
smooth progress is important to rapport.

The introductory or motivating talk follows immediately after the
arrival and seating of the examinees. In prior informal contacts examinees
often ask questions of examiners or proctors, thereby demonstrating the
widespread need for orientation to that which is to take place, even when
testing is voluntary and sought after. The knowledge that he is about to
put himself to a test or proof makes the examinee somewhat insecure and
self-conscious, so that he wants reassurance or feels the need to be
somewhat aggressive and belligerent. The examiner or proctor, knowing this,
can accept his remarks in a calm and friendly way, stating perhaps that
something will be said about the nature of the tests before they are
started. The motivating talk should be brief and to the point. Its
objective is to set the stage for effective testing by giving the subject some
idea of what he is going to do and how long it will take him, and to make
him want to portray himself accurately on the tests by relating the taking
of the tests to his goals. In vocational counseling the goal is
self-understanding and better adjustment to the world of work; in vocational
selection it is the obtaining of a job in which he will find success and
satisfaction. These themes can be elaborated upon in ways appropriate to
the age and occupational level of the examinee, but it is well to be sure
that the goals are real to those being tested and that the language used
in discussing them is appropriate both to the examiner and to the
examinee.

The Sequence of Tests 

In formal testing programs the nature of the tests which need to be 
given to a particular individual or group determines to some extent 
the sequence of tests which can be administered. Within these limits, 
however, it is desirable to arrange the order in the way which is likely to 
interest the examinee most and to get the maximum co-operation from 
him. As a rule, the following principles have been found effective in
arranging the sequence of tests.

The first test in the series should be something of a buffer, one on
which the examinee can warm up, get some self-assurance, and develop
some interest. For this reason it should not be too hard, should be
relatively impersonal (i.e., neither an intelligence nor a personality test) and
objective, and should have “face validity” or seem pertinent to the reason
for taking tests (i.e., it should, in the case of pilot selection, look like a
test that has something to do with flying an airplane).

Next should come a test or tests with long and difficult directions,
difficult content, or other characteristics which make desirable an alert
mind, ability to concentrate, and willingness to apply oneself. Tests of
this type might come after one or two of the first type, depending on the
number and length of those in each category, or they might alternate.

Tests which the examiner prefers not to have remembered in detail,
if there are such, should come late in the sequence but not at the very
end. Personality inventories which contain touchy items or which might
be joked about afterwards are in this category. If taken after the difficult
tests and before all the other tests have been given they provide some
variety and relaxation when it is needed and are likely to be
half-forgotten by the time testing is finished.

The last test should be relatively short and pleasant, to help the
examinee leave with a good taste in his mouth. If a group is being tested
together, it is often desirable to let the test be a speeded test so that all may
stop and leave at the same time, as having some leave while others are
working tends to make the latter finish hurriedly or carelessly, and
keeping those who have finished for more than a few minutes is difficult
because of restless eagerness to leave. When testing individuals in a group
with different tests, untimed tests or inventories may be satisfactory to
finish with; the individual can be left more or less to his own devices and
others can be given attention.
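
The four sequencing principles above can be restated as a small ordering
routine. What follows is only a sketch under stated assumptions: the
category labels, the test names, and the function arrange are
hypothetical conveniences introduced for illustration, and the rule used
when no closing test is available (holding back one ordinary test to end
the session) is an assumption, not a procedure given in the text.

```python
from dataclasses import dataclass

# Category labels are illustrative assumptions, not terms from the text.
BUFFER = "buffer"        # easy, impersonal, face-valid opener
DEMANDING = "demanding"  # long directions or difficult content
SENSITIVE = "sensitive"  # best half-forgotten; late but not last
CLOSER = "closer"        # short, pleasant final test
OTHER = "other"


@dataclass
class Test:
    name: str
    category: str


def arrange(tests):
    """Order a battery: buffer first, demanding tests next, sensitive
    tests late but not last, and a short pleasant test at the end."""
    def pick(category):
        return [t for t in tests if t.category == category]

    buffers, closers, sensitive = pick(BUFFER), pick(CLOSER), pick(SENSITIVE)
    first, last = buffers[:1], closers[-1:]
    # Surplus buffers and closers are treated as ordinary middle tests.
    middle = pick(DEMANDING) + buffers[1:] + pick(OTHER) + closers[:-1]
    # Assumption: with no closer available, hold back one ordinary test
    # so that a sensitive test never ends the session.
    if sensitive and not last and middle:
        last = [middle.pop()]
    return first + middle + sensitive + last


battery = [
    Test("Personality inventory", SENSITIVE),
    Test("Short speeded test", CLOSER),
    Test("Mechanical comprehension", BUFFER),
    Test("Long-directions test", DEMANDING),
]
for test in arrange(battery):
    print(test.name)
```

Such a routine settles only the order; whether a given test belongs in a
given category remains a matter for the examiner's judgment.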

Informal testing characterizes much counseling work carried on
entirely by one counselor utilizing interviewing and other techniques
(J 13 ). Then nothing approaching a “test battery” is administered, but
certain tests are used as questions come up on which it is believed they
will throw light. In such testing the question of sequence is settled by the
factors making testing seem desirable: the question to be answered
provides the motivation for the test being used. The problem is then
entirely one of selecting an appropriate test rather than one of arranging
the tests in the best possible order. Bases for test selection are made clear
in later chapters of this book.

Following Directions in Testing

It has already been pointed out that the very ease with which tests are
administered breeds errors. Group test administration is likely to be
thought of as requiring less skill than other testing operations; group
test proctors in aviation cadet classification testing referred to themselves
colloquially and collectively as the “bunion brigade.” Unless examiners
and proctors are aware of the ease with which errors are made and are
challenged by the need for care, they are likely soon to be guilty of
unknowingly modifying the introduction to testing in such a way as to
change the examinees’ motivation for better or for worse, of changing
directions in ways which give them either more or less help than they
should have in taking the test, of answering questions which give them
an unfair advantage in comparison with the groups on whom norms were
established, and even of allowing too much or too little time in which to
take the tests.

The writers of motivating talks and of test directions intend to convey
ideas to the examinees which will motivate them in certain ways and lead
them to work according to certain methods. If, therefore, the examiner
understood exactly what the test constructor intended to convey, and if
he were able to express that idea just as clearly as the test author in words
of his own, there would be no reason why he should not rephrase the
directions to suit himself and vary his statements from time to time.
Unfortunately, however, experience has demonstrated time and again that
while the modifications made in test directions may be just as clear to the
examiner as were the originals, they are rarely if ever as clear to the
examinee. The reason for this is obvious enough: the directions supplied
with a well-constructed test have been tried out a number of times on
subjects like those for whom the test was developed before it was finally
published, and each time they were rewritten somewhat and improved
after criticism by examiners and examinees in order to make sure that
the intended meaning and the understood meaning were identical.
Obviously, the directions more or less casually phrased and even more
casually tested by the user of a test are not likely to be as clearly and as
uniformly understood as those that are printed with the test. Only a
highly skilled examiner who knows both his test and his subjects well
should allow himself the privilege of improvising or modifying directions.
At the same time all examiners need to scrutinize the printed directions
carefully to be sure that they are well drafted. If they are not suitable
for the group in question the suitability of the test itself may be open to
question; if the test is suitable with different directions, the norms may
no longer be applicable. These matters are subject to empirical check,
and if judged important enough the answers may be found by
experimental methods. It is good practice for examiners to have a manual or
loose-leaf notebook of test directions, and to know these well in order
to facilitate reading them while administering tests.

Examinees’ questions need to be viewed by the examiner as possible
requests for changes in the test directions. If the information asked for
was supposed to be conveyed by the directions, and if understanding of
the directions was supposed to be achieved before beginning the test
(rather than being a part of the test), the examiner should answer the
questions promptly and concisely. If, on the other hand, answering the
question would give the examinee an understanding of the test or
information which the directions were not intended to convey, to do so would
be to make his score meaningless, or at least impossible of comparison
with those of others who took the test and on whom the norms were
based. In such a case the best answer is “That’s for you to decide” or some
equivalent which makes it clear that the examinee must find the solution
himself. It should be stressed that the number of questions asked, and
their legitimacy, depends to a very considerable extent upon the manner
of the examiner. If he gives directions in too businesslike and cold a
manner questions which should be asked will not be voiced; if he is too
informal and friendly too many unfair questions will come up; but if he
gives directions clearly and pleasantly he will meet with an optimum
number of questions concerning matters included in the directions (few
but all necessary questions) and a minimum of questions of types which
he should not answer.

This leads to the topic of the examiner’s voice and attitude, both of
which have considerable effect on the attitudes of examinees and
therefore on the validity of the tests which they take. An examiner whose
clear, confident, and friendly voice and interested alert manner are noted
by the examinees gives them the feeling that the tests are important,
interesting, and worth taking seriously; one who is lackadaisical in
manner, fearful in front of a group, or careless in his speech is not likely to
create in his subjects attitudes which make for serious application and
genuine co-operation. When proctors assist in test administration, the
manner in which they walk the aisles and watch examinees or stand idly
by with their minds obviously far away is equally important.

The need for accuracy of timing has already been mentioned. In
administering tests of manual dexterity or other aptitudes best measured
by apparatus tests this necessitates a stop watch with its easily controlled
second hand. Most paper and pencil tests, however, can be timed with
sufficient accuracy by means of the second hand of an ordinary watch if
the hand is long enough. A watch with a sweep-second hand is even
better, although still not as easily used as a stop watch because the examiner
must watch the second hand enough to count the number of times it
goes around. This is best done by tabulating on a pad. If a stop watch is
available, the freedom to spend more time watching examinees and less
time looking at the second hand is desirable. As stop watches are
sometimes erratic, it is advisable to check their operation before testing, and
to note the starting time on one’s wrist watch or on a clock (or to have a
proctor time with a second stop watch) in order to be sure that watch
trouble does not prevent accurate timing of a test that is actually under
way. Finally, when testing large groups it is good practice to instruct
examinees to put their pencils down and lean back in their chairs at the
word “Stop,” thus making it easy for examiner and proctors to insure
the respecting of time limits on strictly timed paper and pencil tests.

Observing the Behavior of Examinees 

Careful observation of the manner and attitude of the examinees has 
long been standard practice in clinical testing, and has been carried over 
into vocational testing by those with clinical training. Baumgarten 
published a list of types of behavior which should be looked for by the
user of vocational tests (51), and Bingham translated and converted this
into an Examiner’s Checklist (94:229-235). An abbreviated and somewhat
modified form of this checklist is included here because of its value in
suggesting types of behavior which may be worth noting and the possible
significance of such behavior, and because it is useful in training
psychometrists, counselors, and personnel workers to get more than a test
score from the administration of a test. In actual practice, however, such
elaborate forms are rarely used: instead, the examiner who has learned
to observe behavior in testing simply makes note of anything which he
believes may be significant and includes it in his test report. Beginners do
well to use a form such as this for some time, in order to learn what to
look for and to get the habit of noting it; once the habit has been acquired
the simpler method can effectively be adopted instead. Examples of
notations of behavior in taking tests and of their use in interpreting test scores
are given in the later chapters in which methods of reporting test
results are discussed in some detail and the content of test reports is
illustrated.

EXAMINER’S CHECKLIST

Examiner ____________    Subject ____________
Date ____________        Test ____________

(Each entry provides space under two headings: BEHAVIOR and
INTERPRETATION.)

I. PRELIMINARIES
   A. Attention to Examiner
      1. Attentive ____  2. Looks around ____
   B. Questions
      Yes ____
   C. Speed of Approach to Test
      1. Rapid ____  2. Slow ____
   E. Confidence in Approach
      1. Disparages task ____  2. Enthusiastic ____  3. Self ____
   F. Judgment of the Task
      1. Verbal ____  2. Gesture ____

II. EXECUTION
   A. Starting
      1. Deliberation
         a. Yes ____  b. No ____
      2. False Starts
         a. Yes ____  b. No ____
         a'. Perseverates ____  b'. Changes ____
   B. At Work
      1. Direction of Attention
         a. To task ____  b. Away ____
      2. Degree of Attention
         a. Concentrated ____  b. Distracted ____
      3. Expression of Feelings ____
      4. Body Movements
         a. Co-ordinated ____  b. Not co-ordinated ____
      5. Hand Movements
         a. Appropriate   Yes ____  No ____
         b. Sure          Yes ____  No ____
         c. Quick         Yes ____  No ____
      6. Work Tempo
         a. Quick ____  b. Slow ____
      7. System
         a. Yes ____  b. No ____
      8. Regularity
         a. Yes ____  b. No ____
            1) Crescendo ____  2) Diminuendo ____  3) Alternately ____
      9. Care and Neatness
         a. Careful ____  b. Sloppy ____
   C. Frustration-Tolerance
      1. Asks no help
         a. Indifferent ____  b. Gives up ____  c. Solves problem ____
      2. Asks help ____
         a. Once ____  b. Repeatedly ____
      3. Receives help ____
         a. Indifferently ____  b. Happily ____  c. Thankfully ____
         d. Critically ____  e. Trustfully ____  f. Without use ____
   D. Obedience to Instructions
      1. Exact ____  2. With Deviations ____

III. ATTITUDE TOWARD PERFORMANCE
   A. Notices Mistakes ____
      a. In process ____  b. At end ____  c. Sporadically ____
   B. Mistakes Unnoticed ____
   Shows ____
      a. Pleasure ____  b. Vexation ____  c. Not clear ____

IV. CONDUCT AFTER TEST
   A. Silent and Watchful ____
   B. Announces Result ____
   C. Asks Evaluation ____
   D. Expresses Feelings ____
      a. Satisfaction ____  b. Vexation ____
   E. Leaves Materials
      a. In order ____  b. In disorder ____

One word of caution should be said at this point. Some clinicians 
delight in telling how much is learned about a subject from the way in 
which he attacks a problem, from his procedure in putting together a 
set of Wiggly Blocks or from his persistence in working on a difficult
mechanical problem. These symptoms are extremely interesting, and it is 
easy to be carried away by the tendency to build an ambitious account 
of a personality upon them. They are, however, minute segments of be¬ 
havior observed in a limited situation, and there is no real evidence that 
the behavior so manifested is typical of behavior in other situations. The 
possible insights which may be gained from watching a person solve 
arithmetic problems while taking a standard test should not be missed, 
but it should be remembered that it is the score which has been proved
reliable and which is known to be related to behavior in other situations,
not the method of approach or the reaction to frustration. At the same
time, a knowledge of these latter helps one to understand how and why
the obtained score was obtained, and provides data which may, with many
other items from other situations, help in the construction of a picture
of the counselee’s personality.

Condition of the Examinee 

Those who take objective tests sometimes claim that they are too
nervous at the time of testing to do themselves justice, or that they were
not in good health at the time and were therefore handicapped. In certain
extreme cases these claims are no doubt warranted, as for example in that
of a married man who took a test the morning after a violent quarrel
with his wife and a subsequent resort to alcohol: he was in the second
decile of the comparison group in that testing, but on a retake three
months later, after a divorce and when clear-headed, he was in the
highest decile. Despite these occasional rather obvious and verified cases, there
is a good deal of skepticism concerning most such claims. Oddly enough,
there has been very little research on these problems.

The influence of tension on the intelligence test scores of children was
investigated by Yager (9j7) with a group of forty boys from ten to twelve
years old. They were first tested under normal conditions, then under
tension presumably produced by threats and evidenced by physiological
changes. Thirty of the boys made better scores, but ten showed losses.
The tendency to improve or to break down under tension was related to
emotional stability. This experiment appears to confirm the belief that
only a few persons, and those the neurotically inclined, suffer from the
tension-creating conditions of testing.

The effects of health were investigated by British army psychologists 
in a study referred to by Vernon (897). Standard selection tests were taken 
a second time by women recruits, and differences were related to
menstrual phase. The effects of menstrual cycle on test scores were found to
be negligible. Another group of over 1000 were asked at test and retest 
whether or not they felt able to do themselves justice; less than four per¬ 
cent claimed not to be able to do themselves justice, but their scores were 
not significantly different from those of the others. Those suffering from 
colds showed a slight, but not significant, drop in scores. 

Another study which may have some bearing on this problem is a
report by Glick (292) that freshmen who took the college intelligence
tests during the New England hurricane of 1938 made scores 20 percent
higher than those of other years. When subsequently retested, they were
shown to be a normal group. Glick suggests that their “hurricane
intelligence” may have been the result of stimulating effects of ozone in the air
at the time of the hurricane.

These studies, like those of the effect of distractions on test scores,
suggest that the minor illnesses which do not confine one to bed can be
dismissed as having no appreciable effect on test scores, but that the more
serious impairments are sufficient justification for questioning a test score.

Scoring Tests

The methods of scoring the tests which are widely used in vocational
guidance and selection are objective and generally quite simple. The tests
in which scoring involves judgment and training on the part of the
examiner are used almost exclusively in clinical work; exceptions to this
statement are the clinically interpreted Wechsler-Bellevue Test of
Intelligence, sometimes used as a special check in cases of adults who may be
verbally handicapped, and the Rorschach Psychodiagnostic, which is
occasionally used in connection with executive selection. Both of these
require extended training of a type which is given in special courses,
and are the subject of special books (91.J, 5G, r t 7, ro8, 433). Most of the tests
which are widely used and which are discussed in this book are scored by
means of stencils or keys which can be used by a clerk or by clerically
operated scoring machine; the others have simple time scores. For this
reason only two points need to be made concerning the scoring of
vocational tests.

The first of these is, again, familiarity with the directions. Persons
scoring tests must first be sure that they understand the procedure. A routine
can then be established that fits the immediate situation. If hand scoring
is in order, clear and durable keys or stencils should be made, and scores
should be systematically calculated and entered on record blanks. If
machine scoring is done (it should be in any large-scale operations) this
work will either be performed by a commercial scoring organization or
by an especially trained scorer who is competent to set up procedures.

The second point has to do with checking. Even the best of scorers
make errors, as illustrated at the beginning of this chapter. For this reason
all scores should be checked by another person, at all stages, if hand
scoring is utilized. If machine scoring is used, all manual steps should be
checked. If an accurate instrument is worth using, it is worth insuring
its accurate use.
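
The checking of one scorer's work by another can be illustrated by a
comparison of two independently prepared score records. The sketch
below is offered only as an assumption about how such a check might be
kept in a modern office; the function name find_discrepancies, the
examinee numbers, and the scores are all hypothetical.

```python
def find_discrepancies(first_scoring, second_scoring):
    """Return the examinee IDs whose two independent scorings differ,
    including IDs that appear in only one of the two records."""
    all_ids = set(first_scoring) | set(second_scoring)
    return sorted(
        examinee for examinee in all_ids
        if first_scoring.get(examinee) != second_scoring.get(examinee)
    )


# Hypothetical raw scores entered by the scorer and by the checker.
scorer = {"E01": 42, "E02": 37, "E03": 51}
checker = {"E01": 42, "E02": 73, "E04": 20}

to_rescore = find_discrepancies(scorer, checker)
print(to_rescore)
```

Every paper so listed would be rescored from the original answer sheet
rather than settled by accepting either entry.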

INTELLIGENCE

Nature and Role

INTELLIGENCE has frequently been defined as the ability to adjust to
the environment or to learn from experience. As Garrett (uNi) has pointed
out, this definition is too broad to be very helpful in practical work. One
might therefore resort to an operational definition, and say that
intelligence is the ability to succeed in school or college: such a definition would
be justified by the fact that the criterion used in standardizing intelligence
tests has generally been one of school placement and progress. This line
of thought is illustrated by the tendency of many school and college
officers to talk in terms of scholastic aptitude and scholastic aptitude tests,
thereby implicitly limiting the application of such tests to the situations
in which they have been proved valid and dodging the issue of their role
in other types of situations.

An equally operational, but more psychological and therefore more
generally applicable, definition is suggested by Garrett in the paper
referred to above. “Intelligence . . .” he states, “includes at least the
abilities demanded in the solution of problems which require the
comprehension and use of symbols.” This definition is operational in that it
is based on an analysis of the task involved in solving the problems
presented by an intelligence test. It is broader than some test-based
definitions because it applies not only to the tasks presented by the test, but
also to the tasks presented by the school or college courses, success in
which it is designed to predict. It is broader even than this, because it
allows for the value of such tests in predicting success in certain types of
occupations, namely those in which job analysis shows that it is necessary
to comprehend and use symbols. And it has the additional advantage of
taking into account the important work of the past ten or fifteen years
which demonstrates that intelligence is not one aptitude but a
constellation of aptitudes. As these components of intelligence apparently vary
in importance in different occupations according to the type of symbol
most frequently used in that occupation, this advantage is of great
practical significance.

Two closely related questions normally come up for discussion at this
stage: those of the innateness and the constancy of intelligence. During
the 1930’s they were the subject of much debate and disagreement among
psychologists, an excellent overview of which is provided by the 39th
Yearbook of the National Society for the Study of Education (920);
reference should also be made to a paper by Stoddard (760) expounding the
environmentalist point of view, and to papers by McNemar (501),
Thorndike (H32), and Wellman et al. (917, 918), in which detailed questions of
the methods and results of nature-nurture studies are examined at length.
The topic is much too complex for treatment in a handbook on
vocational testing. The reader who has not studied sources such as those
referred to, or who has not sufficient time to do so, must rest content with
the general conclusion reached by this writer. This is that whereas both
nature and nurture play a part in the development of intelligence, mental
ability as indicated by the intelligence quotient is relatively constant
from the time a child enters elementary school until late adulthood. It
is true that the obtained I. Q. will vary some after the age of six, but this
is generally more a function of the tests, which are often not strictly
comparable at different age levels and which are in any case subject to
errors of measurement, than of the individual. Some changes which are
too great to be explained by these causes are the result of emotional
conditions which invalidate the score of one test, or of organic changes
resulting from disease or injury. That there are other changes, not
explained by any of these factors and attributable to changes in the
environment which modify the functioning intelligence, has not been
demonstrated to the satisfaction of all competent judges with persons
of elementary school age or older.

Intelligence and Educational Success

The role played by intelligence in educational achievement has been
frequently studied. Comprehensive reviews of the research are available
in Pintner (hop Ch. 10-12) and in Strang (jhUiyx-cj'j). Our attention
will be focused on certain points, an understanding of which is needed in
the use of intelligence tests in educational and vocational guidance, and
on some data illustrating those points.

Different curricula have been found to require or to attract different
degrees of intelligence, whether at the high school or at the college level.
In general, students in scientific and liberal arts courses have the highest
intelligence test scores, with those in commercial subjects coming next
and trade courses last. In one nation-wide study (417) the median I. Q. of
high school boys in different courses was as follows:

Table 3

MEDIAN I.Q.’s OF BOYS IN HIGH SCHOOL COURSES

Course                                       Md. I. Q.

College Preparatory (Technical Schools)         114
Scientific (General Schools)                    108
Academic                                        106
Commercial                                      104
Trade                                            92

The exact figures vary from one community to another and from time
to time. It is therefore necessary to have local norms in actual counseling;
in fact, not only the trends, as indicated by averages, are necessary, but
even more needed are minimum critical scores which show what score a
student should make in order to be a good risk in each type of training.
The importance of local norms is further illustrated by the fact that in
some cities, Buffalo for example, there are trade schools which offer such
attractive training that entrance is quite competitive, whereas some of
the general high schools attract students of less ability who, for cultural
reasons such as the prestige of academic training, want the traditional
education. It should be remembered, too, that if general intelligence were
broken down into its component factors, the group which ranked highest
on one might well rank lower on another.
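
The notion of a minimum critical score can be made concrete with a small
computation on local records. The following is only a sketch under
assumptions of this editor's choosing: the records are invented, the
candidate cutting scores and the tolerated failure rate of 25 percent are
arbitrary, and the rule adopted (the lowest cutting score at or above
which the failure rate is acceptably low) is one possible definition, not
one prescribed by the text.

```python
def failure_rate(records):
    """Proportion of (score, failed) records in which the student failed.
    Returns None when there are no records to judge from."""
    if not records:
        return None
    return sum(1 for _, failed in records if failed) / len(records)


def minimum_critical_score(records, cutoffs, max_failure_rate):
    """Lowest candidate cutoff at or above which the failure rate does
    not exceed max_failure_rate; None if no candidate qualifies."""
    for cutoff in sorted(cutoffs):
        at_or_above = [r for r in records if r[0] >= cutoff]
        rate = failure_rate(at_or_above)
        if rate is not None and rate <= max_failure_rate:
            return cutoff
    return None


# Hypothetical local records: (I.Q. equivalent, failed academically).
records = [
    (95, True), (100, True), (104, True), (108, False), (108, True),
    (112, False), (115, False), (120, False), (125, False), (130, True),
]

critical = minimum_critical_score(records, [100, 105, 110, 115], 0.25)
below = failure_rate([r for r in records if r[0] < 110])
print(critical, below)
```

With records as few as these the result would of course be unstable; the
computation is meaningful only when based on local groups of adequate
size.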

Differences in the intelligence scores of students in different institutions
have been found which, like curricular differences, are in line with
popular expectation. Some of these can be expressed in generalizations:
liberal arts college students tend to be intellectually superior to teachers
college students, those in small rural colleges tend to be inferior to those
in large urban universities, and those in highly endowed private
institutions tend to be more able than those in state universities (at least when
freshmen classes are compared) or in denominational colleges. The
documentation for these statements is provided by the periodic analyses of the
results of nation-wide testing programs such as that of the American
Council on Education (840), in which some 350 colleges and universities
of all types usually participate. After World War I studies made in a
number of universities with the Army Alpha Intelligence Test gave
norms for a larger group of identified institutions than more recent
publications, which generally use code numbers rather than names. The
data have been collected by Pintner (004: 29b); converted into Otis I. Q.
equivalents, these reports show that some twenty years ago the median
I. Q. at Yale was 141, at Oberlin 12 j, at Ohio State 120, at Penn State 117,
and at Purdue 115. The overlapping of scores was no doubt considerable,
but the ranges and quartiles are not reported.

The American Council data referred to above are for entering freshmen,
which means that the normal elimination as a result of academic failure
has not yet taken place. This is especially important at state
institutions which are obliged to admit great numbers of high school graduates
who subsequently fail to keep up with their classes, and which therefore
have freshman attrition rates as hi ,v> and 60 percent. In colleges
using more stringent selection standards the difference in the average
intelligence of freshmen and seniors is much smaller. Really adequate
data on intelligence and college success would, as in the case of curricula,
provide minimum critical scores for each college. Individual colleges, as
will be seen shortly, have such data for their own use. The published
material, however, is simply in terms of freshman averages and variations.
In 1958, for example, the colleges using the A.C.E. Psychological
Examination (8p>) reported freshmen medians which, when converted
into Otis I. Q. equivalents, range from <>f to 122, the median college
having a median freshman I. Q. of 108. The interquartile deviations were
such that the college with a median freshman I. Q. of 9} had a freshman
class in which one fourth of the students had I. Q. equivalents of less than
90, and only one fourth exceeded 100.

Data for one liberal arts college, Oberlin, have been reported in some
detail by Hartson (p;jS. gpj'j), who has set an example which, if followed
by other college officials, would be of general benefit in improving the
college counseling done in the high schools. At Oberlin some students
with Otis I.Q. equivalents of less than 100 manage to graduate, but
Hartson found that 65 percent of the entering freshmen who were below
110 failed academically. In another college of approximately the same
academic but lower social standing, it was found that there were
practically no freshmen with I.Q.'s of less than 110, indicating that the
latter institution was admitting students on a more selective academic
basis. Attrition data also showed a higher mortality rate among the lower
intelligence levels at the latter college. Obviously, the former institution
would be a better choice for a student with an I.Q. equivalent of about
110. At Franklin and Marshall the mean I.Q. was 111, but here also 65
percent of those below 110 failed (512).

Despite the relationship between intelligence and educational
achievement revealed by data such as the above, the correlation between
intelligence tests and grades is not especially high. The numerous
summaries of the subject show that in high school the correlations tend to
range from .30 to .80, and in college from .20 to .70, the modal r's being
.40 and .50 in the former and between .30 and .50 in the latter. The
relationship in college seems lower than in high school because the
selection procedures in colleges cut down the range of ability in their
populations, and this in turn makes the correlation coefficients shrink
artificially. The relationships are high enough to make them useful in
studying groups, but the margin of error when working with individual
students is so great as to make considerable caution necessary in test
interpretation and to require that the counselor or admissions officer give
considerable weight to other indices such as high school marks, family
educational achievement (as an indicator of what his intimate social group
expects of him), personality adjustment and motivation. None of these,
taken by itself, is any more valid than the score of a good intelligence test
for predicting college marks, but, taken together, they yield a better
prediction than any single index (ybfuisg). To cite the Oberlin studies
once more, the fact that 65 percent of the freshmen who were admitted
with I.Q.'s of less than 110 failed academically is a legitimate reason for
questioning the choice of that college by a student with an I.Q. of 110; on
the other hand it should be remembered that 35 percent of such students
graduated. The counselor must ask himself, and get the student to ask
himself, what reasons there are for expecting him to be in one group rather
than in the other, and whether or not a less competitive situation might be
more conducive to his fullest all-round growth.
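
The artificial shrinkage of correlation coefficients in selected college
populations, noted above, can be illustrated with a small simulation. The
sketch below is illustrative only: the population correlation of .60, the
I.Q. scale (mean 100, standard deviation 15), the admission cutoff of 110,
and the function names are all assumptions chosen for the example, not
figures from any of the studies cited. It draws test scores and grades that
are related in the whole population, keeps only those at or above the
cutoff, and compares the two coefficients.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_restriction(rho=0.60, n=20000, cutoff=110.0, seed=0):
    """Return (r in the full group, r among those at or above the cutoff).

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    iq, grades = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        iq.append(100.0 + 15.0 * z1)
        # Grades (in standard units) share the correlation rho with the test.
        grades.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    full_r = pearson_r(iq, grades)
    admitted = [(q, g) for q, g in zip(iq, grades) if q >= cutoff]
    restricted_r = pearson_r([q for q, _ in admitted],
                             [g for _, g in admitted])
    return full_r, restricted_r


if __name__ == "__main__":
    full_r, restricted_r = simulate_restriction()
    print(f"r in the unselected group: {full_r:.2f}")
    print(f"r among those admitted:    {restricted_r:.2f}")
```

Under these assumptions the coefficient in the admitted group comes out
well below that in the unselected group, although the underlying
relationship between test score and grades is the same for every
individual; only the range of ability has been cut down.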

The relationship of intelligence test scores to educational achievement
has been demonstrated in one other type of study, in which a genetic
approach has related intelligence to amount of education obtained. These
studies make it clear that, on the whole, those who are most able obtain
the most education. Proctor (613) made a follow-up study in 1930 of
persons who had been tested while in school in 1917, and found that those
who, in 1930, had gone no further than the 9th grade had an average I.Q.
in 1917 of 105, whereas those who had graduated from high school had a
mean of 111 and those who went to college had averaged 116. This should
not, of course, be taken as proof that students who have the ability to do
college work manage to go to college: the Pennsylvania Study (p-,8)
demonstrated that the bright students who actually get to college are
matched by an equally able but economically less fortunate group who do
not obtain that much education. What Proctor demonstrated is that those
who get more education are, on the whole, more able than the much larger
group who obtain less education.

Terman's long-term studies of gifted persons, begun in 1922 and
reported in two follow-ups (821, 823), provide some more data which
demonstrate the importance of intelligence in completing an education.
Almost all of his group of 1300 children with I.Q.'s of more than 140
graduated from college (helped, be it said, by the fact that they lived in a
state which provides more low-cost higher education than any other for
its residents).

The studies mentioned so far have all dealt with the relationship
between intelligence and educational achievement, none with the role of
the former in satisfaction in one's studies. It is generally assumed that the
placement of a student at the proper educational level, one on which he
can compete with his peers without undue strain and on which he will
be challenged by the need to exert himself in order to master the
subject-matter, results in better adjustment and greater satisfaction on his
part. The assumption seems reasonable. Every experienced teacher can cite
instances in its support. The literature of clinical psychology abounds in
references to cases illustrating it (i.jo, 987). But oddly enough there are
no studies involving objective measures and carefully quantified data to
prove the validity of the assumption. In one investigation Berdie (78)
correlated intelligence test scores and measured satisfaction in the study
of engineering, finding an r of .02. This is disappointing, but is probably
more a defect in the experiment than in the hypothesis: the scale used for
the measurement of satisfaction may not have been sensitive to what it
attempted to assess, or the relationship may be such that it would
manifest itself in a study of many curricula without being revealed in a
study of one type of curriculum. The latter, it will be seen, is true of the
relationship between intelligence and success in occupations. Although
one is justified in generally being skeptical of clinical experience and
professional opinion unsupported by experimental evidence, this would
seem to be one instance in which it is best, pending the carrying out of
adequate objective studies, to accept the evidence of subjectively analyzed
experience. This would lead one to conclude that students who are placed
in courses which are difficult enough to make them work but not so
difficult as to discourage them are most likely to be satisfied with and
interested in their studies.

Intelligence and Vocational Success 

As intelligence has been supposed to affect vocational success in a
number of different ways, tests have been correlated with a variety of
criteria. These include wisdom of vocational choice, success in training,
ability to secure a job of a particular type, adjustment in the world of work
as shown by placement on the occupational ladder, status in the occupation
as indicated by criteria ranging from tenure to earnings, and satisfaction
in one's work. Each of these will be discussed in the following paragraphs.

Vocational Choice. In a number of studies (305, 728, ej p{) the more
intelligent individuals have been found to have more appropriate
occupational objectives. This is what one would expect on a priori grounds,
not only because the more able should have better insight into their own
abilities and into job requirements, but also because, in a society which
encourages people to aspire to the higher levels, they have more of the
abilities which are required for success in the prestige occupations. The
factors considered in these studies have usually been limited in number.
Sparling (728), for example, compared the tested intelligence of the
student with the intelligence considered necessary for success in his chosen
field on the basis of an analysis of intelligence test data gathered from
soldiers in World War I, while Wrenn (ppp) compared the correspondence
between measured and self-estimated interests at different intelligence
levels. Atomistic as they are in their approach to these problems,
the investigations justify one in concluding that the more intelligent are
more likely, other things being equal, to make wise vocational choices.

Success in Training. This topic has been dealt with under the heading
of intelligence and educational success, as most formal training is under
educational auspices and bears an educational label. But, since one
cannot succeed in medicine or flying without first succeeding in medical or
flying school, success in training is the first step in vocational success. It
is frequently much easier to obtain criteria of success in training than in
the practice or pursuit of the vocation itself. For these reasons training
success is a commonly used criterion of vocational success, and needs to be
mentioned in this section.

Securing Employment. Studies of the relationship between intelligence
and ability to secure employment have been made in depression years, as
those are the times when attention is focussed on the problem of what
it takes to obtain a job and on the differences between employed and
unemployed workers.

Few of the youth studies of the 1930's used measures of intelligence,
presumably because they were large scale surveys in which accurate
testing was impractical. In several studies confined to more accessible
subjects, however, testing was carried out with what at first appear to be
surprising results. Dearborn and Rothney (197) analyzed the relationship
between tested intelligence and success in securing employment in a large
sample of youth who were subjects of the Harvard Growth Study and
lived in communities adjacent to Cambridge, Massachusetts. They found
no relationship. Lazarsfeld and Gaudet (157) studied a small but carefully
matched sample of youth in Essex County, New Jersey. They also
reported no relationship between tested intelligence and success in finding
employment.

In contrast to these and similar studies of young persons stand the
investigations dealing with adults in the depression. Morton (543) and
Paterson and Darley in their summary of the psychological work of the
Minnesota Employment Stabilization Research Institute (589) reported
that, in a variety of occupational groups in Montreal and in Minneapolis,
the early unemployed were less able than those who were released later
in the Depression. At least in retaining their jobs, then, the more
intelligent fare better than the less intelligent. This suggests that in
employing young people the average business man either does not have
access to or does not utilize data revealing the abilities of the employment
applicants, but relies instead on other and, as Dearborn and Rothney
showed, less relevant indices, whereas the employer who is considering
releasing employees does depend more on indices of ability. In the case of a
worker already in his employ this need not be, and generally is not, an
intelligence test, but is simply the employer's judgment of the relative
value to the company (efficiency, versatility, etc.) of each of the persons in
question. No such ability data, which frequently correlate with intelligence
test scores (see below), are available to the employer of relatively
inexperienced youth, although school experience should be such as to
provide employers with data of the same type and intelligence tests can be
used in selection. Personnel men should be able to make considerable
improvement in their work by bringing their practices in employing new
workers up to the level of their practices in releasing workers when staffs
must be reduced.

Attainment on the Occupational Scale. In a culture in which material
success and ability to rise to or maintain a high socio-economic level are
valued as highly as in ours, the question of the relationship of intelligence
to attainment on the occupational scale is one of vital importance. If the
relationship is close, then the ambitions of many persons are unrealistic
and, if not modified by experience, are doomed to disappointment,
whereas if the relationship is not close then there is some justification for
the widespread encouraging of youth to aspire to the higher levels.

The first large-scale studies of this question were made possible by the
mass of data accumulated as a result of the use of intelligence tests in the
Army of the United States in World War I. These were analyzed and
published in the Memoirs of the National Academy of Sciences (525), and
were subsequently reworked by Fryer (276) and by Fryer and Sparling
(278) to make them more usable in vocational counseling. Similar data for
World War II, based on a sample of some 90,000 white men, have been
organized in a similar table by Stewart (758), reproduced on pp. 96-97 by
permission of Occupations.

A table such as this is useful in ascertaining approximately the
occupational level at which an individual is most likely to be able to
compete without undue strain and, at the same time, with sufficient
challenge to make the work interesting. To know that a student with a
score of 125 has the general ability to compete with men and women who
have been successfully engaged in the lower professional and managerial
occupations, but somewhat less than that which characterizes those who
have made good in the higher level occupations of the same type, is of
value.

But the apparent simplicity of the chart is deceptive because it does
not bring out the great overlapping of the various occupational
intelligence levels. A given occupation actually includes within itself a
great variety of levels: a chemist, for example, may supervise routine tests
on the one hand or do highly creative experimental research on the other,
or, more commonly, something in between these two extremes. This
means that there are opportunities in most occupations for some persons
at relatively low levels who are not likely, if their mental ability is
appropriate to these levels, to rise appreciably in the field, and for others
with greater ability who should, other things being equal, rise to higher
levels. Thus some chemists really belong in the highest occupational level
in Table 4, where the majority are placed, but others should be in the
second group of occupations. Other factors which play a part in
occupational success need, of course, to be taken into account, but are not
in the chart: lack of motivation may disqualify a person from competing
effectively at his appropriate intellectual level, or an unusually effective
personality may enable another to compete above that at which he might
otherwise be expected to make an optimal adjustment.

The overlapping of occupations when classified according to
intelligence is well brought out by Stewart, who reports the median and
adjacent quartiles for a number of different occupations in the Army
sample. For example, a man with an AGCT score of i i r } might, in so far
as mental ability is concerned, be a high-average stock-keeper, average
general clerk, low-average bookkeeper or below-average accountant, all in
the clerical field, not to mention an average draftsman or a low-average
reporter in other fields. Clearly, the extent and nature of the overlapping
is so great that, while occupational intelligence levels provide a rough
guide, they must be used as that and cannot be applied in a mechanical
or arbitrary way.

Another limitation to the value of World War II data is imposed by
the nature of the sample. Some occupations were not adequately
represented in the Army. The War having been a total war, and Selective
Service having operated according to written directives and on the basis
of studies by the War Manpower Commission, we know a good deal about
how the occupations represented in the Army were affected by sampling
problems. As lawyers had a type of training which was at a premium in
neither war industries nor military service during the early years of the
war, it seems likely that the drafted lawyers are fairly representative of
the young lawyers of that time. Psychologists, on the other hand, were at
a premium in both military and industrial personnel work, and as the
Army commissioned many who were aged thirty or more, and the Navy
many who were under thirty, directly from civilian life early in the
war, it is probable that the drafted psychologists who held the Ph.D.
degree at the time of being drafted were not really representative of
young psychologists in mental ability and savoir faire. It is to be hoped
that a thorough-going study of data obtained during World War II will
be made, relating occupational intelligence findings to known policies
of Army, Navy and Selective Service. Stewart did not do this.

A final possible defect in intelligence test data obtained under military
auspices which must be mentioned is the fact that the testing conditions
are often not optimal. Many new draftees were not well oriented to
psychological tests; these often resented the tests as so much mumbo-jumbo.
Others were negativistic in their attitude toward the Service and vented
their feelings in not co-operating in the testing, often to their regret
when they found, later, that they needed a higher score in order to qualify
for officer candidate school (an error many remedied by retaking the test
and making qualifying scores). Still others, heeding rumors that men who
made high scores were being assigned to a type of training they did not
want (for example, to Link trainer instruction when they wanted to be
aerial gunners), made low scores in order to avoid it. But draftee attitudes
were not the only problem. Some were created by “efficiency”-minded
or routine-bound officers who sent men to testing after a night of duty
in the kitchens or after they had had only a few hours' sleep subsequent
to a long trip by troop train. But this should not lead to the conclusion
that all military testing was conducted under poor conditions or that the
results should be entirely disregarded. On the contrary, much of it was
well done, and many, probably most, of those who took the tests tried to
do their best. It is easy for a few dramatic cases to create a false impression
in such a situation.

[Fragment of the table of occupational intelligence levels; the scores are
not legible. Occupations that can be read: Traffic Rate Clerk; Postal
Clerk; Teletype Operator; Student, Sociology; Writer; Student, Civil
Engineering; Statistical Clerk; Student, Chemical Engineering; Teacher;
Lawyer; Student, Business or Public Administration; Auditor; Student,
Dentistry; Personnel Clerk; Student, Medicine; Chemist.]

The trends revealed by military studies have been confirmed not only
in studies abroad by Cattell (151) and Awaji (g j), but also in civilian
studies made in this country by Scott and Clothier (1)8;,) and Pond (hot)).
These last are unfortunately not based on large numbers from all parts of
the country, but their tendency to agree with each other and with the
Army data gives one greater confidence in their trends. Proctor's study
(613. () 1 j) is perhaps as good an illustration as any since it is longitudinal
and covers one community. He tested 1500 students in 1917-18 and
ascertained their occupations thirteen years later. When classified
according to the occupational levels of their 1930 jobs, the results in
Table 5 were obtained.


                              TABLE 5

          INTELLIGENCE IN HIGH SCHOOL AND OCCUPATIONAL
                  LEVEL THIRTEEN YEARS LATER

        Occupational Level                 Mean I.Q.

        I      Professional                I i. r >
        II     Managerial                  108
        III    Clerical                    10.1
        IV     Skilled                     99
        V      Semi-Skilled                97


A final approach to the topic of occupational levels which should be
mentioned is that in which the minimum intelligence required for success
in the simplest type of employment has been investigated. Most
frequently referred to in this country is the study by Unger and Burr (887),
but Dunlop (222) made a similar study in Canada, and Abel (1), Beckham
(58), Channing (153), Lord (481), Fairbanks (244), and others have
also published on the subject in the United States. Table 6 lists typical
occupations in which persons at the lower mental levels have successfully
been employed after adequate induction on the job and when there
were no serious personality problems to complicate things.

                              TABLE 6

           MINIMUM MENTAL AGES FOR SIMPLE OCCUPATIONS
                     (From Unger and Burr)

        Mental Age     Occupation

        5 years        Packing, garden work, scrubbing floors, simple
                       washing
        6 years        Light factory work, light domestic work
        7 years        Assembly work, errands, pasting, farm work
        8 years        Cutting, folding, garment machine operation,
                       laundry, cooking
        9 years        Hand sewing, press operation, filing, stock
                       work
        10 years       Routine clerical, general housework, machine
                       operation, electrician's helper, painter
        11 years       Selling, millinery work, janitorial work


One advantage in employing mentally handicapped adults in jobs such
as the above is that, after the first period of careful supervision while
they are learning the job, they are more likely to be satisfied with routine
work and to be dependable employees than are other persons whose
mental ability is such that they can legitimately aspire to more
challenging work and are impelled to do so by boredom.

Status within an Occupation. The multiplicity of occupations and the
variety of criteria applicable to them have prevented any systematic
study of the importance of intelligence for success within occupations,
as contrasted with success among occupations or placement on the
occupational scale. But there have been a number of studies of the
relationship between tested intelligence and success in certain specific
occupations. An examination of a few typical studies, of their results, and
of the reasons for these results, is important to the user of intelligence
tests in counseling and selection.

Although the occupational level studies have shown that executives
tend to make relatively high scores on intelligence tests, attempts to
correlate intelligence and success in executive positions met with so
little success during the 1920's that they fell into disrepute. One such
study was published in 1924 by Bingham and Davis (95). When the army
type intelligence test was used with 102 business executives, the
correlation between test score and business success as indicated by a
composite rating based on information contained in personal history
records (salary, investments, debts, clubs, theatre attendance, etc.) was
-.10. Their conclusion that “superiority in intelligence, above a certain
minimum (all were above the Army median), contributes relatively less to
business success than does superiority in several non-intellectual traits of
personality” has been generally accepted, and since the late '20's
intelligence tests have generally been used only as a rough screening for
executive positions. However, Thompson (826; see also p. 336) found a
small group of superior executives superior to others on the Wonderlic
Personnel Test.

As in the case of executives, so in that of salesmen, the studies of the
relationship between intelligence and sales ability have yielded negative
results. Most such studies have not been published, as they have been
conducted by or for companies interested in their own personnel
problems rather than by investigators with a more general interest. But
Moore (r,gS: Ch. jf>) states that experience with salesmen of tangibles and
of intangibles has led to an emphasis on work with tests of other types
(largely interests, personality, and personal history). One typical study
is reported by Anderson in his book on personnel work at Macy's (20).
After administering the Otis Self-Administering Test of Mental Ability to
500 sales clerks, Anderson found that the distribution of intelligence
scores clustered in the 80 to 110 range (75 percent), while 20 percent were
below I.Q. 80 and 5 percent were above 110. This led to the conclusion
that intelligence tests were of no value in selecting sales clerks, a
conclusion reiterated by Anderson's successor (537: .p>). Actually, this
approach seems too gross to be conclusive; a more refined analysis might,
for instance, show that rug salesmen are and need to be more able than
packaged food salesmen or girls who sell perfumes. But this would be
classification of sales jobs according to level; one would still need to
ascertain whether the more intelligent rug salesman is more successful
than the less intelligent rug salesman who also is above the critical
minimum. Such studies have not been published, partly for the reasons
given, partly because of the difficulty of obtaining enough comparable
subjects in any one specialty for statistical study. Perhaps they are not
worth making, in view of what we know of the role of intelligence in
occupations in which personality factors are of importance.

Attempts to predict success in teaching, generally as evidenced in
practice teaching while still a student, by means of intelligence tests have
met with the same lack of success as work with executives and salesmen.
Seagoe ((>88) made a study in which she correlated success in practice
teaching, as rated on a specially constructed scale, with scores on a variety
of tests, including the American Council Psychological Examination for
College Freshmen. She found no relationship between measured
intelligence and rated teaching performance, although she did find some
positive results in the area of personality. Earlier studies found equally
disappointing results with intelligence tests, but two recent co-ordinate
investigations suggest that the situation may be more complex than this.
Rolfe (0.J3) found no significant correlation between A.C.E. scores and
success in teaching in one- and two-room rural schools, whereas Rostker
reported a substantial relationship in larger schools with 7th and
8th grade pupils. Apparently the occupation “teacher” is too broad a
category for psychological study.

Results in work with intelligence tests and clerical employees have been
somewhat different, even though clerical workers are not, on the whole,
as able intellectually as executives or teachers. Some of the most
convincing studies of this occupational group have been made by Bills of
the Aetna Life Insurance Company, in collaboration, at times, with Pond of
the Scovill Manufacturing Company (too). In the early study Bills tested
i p; clerical employees at different levels of responsibility, and found a
correlation coefficient of .1 yj with difficulty of the job. Two and one-half
years later the correlation was . p for those who were still employed, the
more intelligent having left the low grade jobs, often for advancement
in the company, and the least able in the higher grade jobs having left
them. Aetna classified its office positions in 1 } categories from A, low, to
If, high. Employees were classified also according to their intelligence test
scores. The results for a study of pog employees in 1 ejgy; (hie)) showed
that a clerical worker with a score above 100 had twice as good a chance
of being promoted to a “responsible” position as an employee with a
score of less than 80. At the same time they pointed out that almost as
many employees with scores above 100 remain at the lowest levels as rise
to the highest. Another study has shown that intelligence is related not
only to promotability in clerical work, but also to efficiency in the
performance of clerical duties in a single job. Hay gave a battery of tests to
machine bookkeepers at the Pennsylvania Company, a Philadelphia bank
(gr,8). The operation was a routine, bimanual job; the criterion was
production, that is, the number of debits and credits posted and of balances
extended in a given amount of time, a criterion which had a reliability
of about .80. Hay points out that amount rather than accuracy is to be
stressed, as inaccurate operators cannot keep their jobs. The correlation
between amount produced and Otis scores was .56 for 39 women
operators. This was higher than the coefficient for any other test in the
battery, which included the Minnesota Vocational Test for Clerical
Workers, Army Alpha, and several manual dexterity tests, although some
of these had values independent of intelligence. These results are reported
to have been consistently obtained over a five-year period. Unfortunately
such studies are rare, and none are known to the writer which throw
light on the applicability of the conclusion concerning intelligence and
this one type of routine clerical work to other types of routine clerical
work, although one would assume that success in semiautomatic tasks
such as filing would, if a criterion were established, correlate even more
highly with intelligence than does production in a practically automatic
machine operation task.

Pintner once wrote: "The lower down the scale of industry we go, the
less valuable do our present intelligence tests appear to be for the
selection of workers" (f>c>{: .jSp). He cited two studies, one by Otis
(r,7S>) with a performance test administered to .joo workers, many of
them foreign born or illiterate, in a silk mill, and a study by Viteles (900)
with motormen, in support of this statement. Since that time a number of
other studies have been made, in which more adequate statistical methods
and better experimental design have been possible, with somewhat
different results. Blum and Candee (10 5) administered the Otis
Self-Administering Test to 372 department-store packers and wrappers,
while Forlano and Kirkpatrick (208) gave it to 20 radio-tube mounters, the
former finding that, although there was no relationship between test
scores and production or supervisors' ratings for employees who had been
on the job for some time, there was a suggestion of a relationship for new
male employees, and the latter reporting that it was related to success only
in the case of the less able learners: the additional increment of
intelligence was of no value to the superior beginners in learning a
routine job. Sartain ((>(>9) reported a correlation of .b.j between
refresher course ratings (reliability .77) and .j(i aircraft factory
inspectors' Otis intelligence test scores. Shuman (716, 717) administered
the Otis to inspectors, engine testers, machine operators, job setters,
various types of supervisors, and other aircraft engine and propeller
factory workers, the groups ranging in numbers from 25 to 99 each. The
correlations between Otis scores and supervisors' ratings (reliability .70
to .91) ranged from .39 to .57, depending upon the skill and
responsibility required by the job. In view of results such as these,
Pintner's conclusions from the earlier studies no longer seem correct.
Instead, the following conclusions concerning intelligence and success
within an occupation seem warranted.

1. People tend, in so far as circumstances permit, to gravitate toward
jobs in which they have ability to compete successfully with others.

2. Given intelligence above the minimum required for learning the
occupation, be it executive work, teaching, packing, or light assembly
work, additional increments of intelligence appear to have no special
effect on an individual's success in that occupation. This conclusion
may be subject to revision as better criteria of success are
developed, and may not apply to more strictly intellectual jobs such
as those in research or to some kinds of teaching, but only to those in
which personality and interest are peculiarly important.

3. In routine occupations requiring speed and accuracy, whether clerical
or semiskilled factory jobs, intelligence as measured by an alertness
rather than a power test is related to success in the learning
period and, in some vocations, after the initial adjustments are
made.

It should be noted that nothing has been reported on intelligence and
success in the higher professions, in skilled trades, nor in unskilled
occupations. This is because no research on these problems has been
located by the writer. It seems likely that a positive relationship would
be found in the first two, and none in the last, but this is still an unverified
hypothesis.

Job Satisfaction. It has long been assumed that, even though a person
might be able to do the work required by a job in which most of the workers
are more able than he, the strain involved in keeping up with the
competition would be such as to produce dissatisfaction in the worker.
It has similarly been widely held that ability considerably in excess of
that required by a job causes dissatisfaction because of lack of challenge
and consequent loss of interest in the work. There is considerable clinical
evidence to this effect, concerning both educational and vocational
activities. Pruette and Fryer (615) analyzed a number of case studies,
confirming these beliefs for employed persons. Scott, Clothier, and
Mathewson (GtSy.pi j) present charts showing the relationship between
amount of school retardation upon leaving school (as a rough index of
intelligence) and desire to change jobs in employees engaged in several
different types of work in one company. For 52 men employed in a repetitive,
monotonous inspection job the curve indicating percent desiring a
change of job increased sharply with intelligence; in the simple but
physically demanding foundry jobs the curve was bell shaped, the peak
for the 42 men in question being at two or three years of retardation,
with those more retarded or less retarded more likely to be satisfied;
while in the assembly department, which offered a variety of somewhat
more complex work, the curve for 86 men decreased with intelligence,
for in this situation the abler men had more opportunity to use their
ability and the less able felt the strain of difficult work. Anderson
(20:88-89) reported similar results in a study of labor turnover in the
packing department at R. H. Macy’s, where the brighter employees were
found to leave their jobs sooner than the duller, seeking better outlets
for their abilities.

It is interesting to note that the studies referred to above were all made
in the 1920’s, when attention was focused on the use of the then new
intelligence tests in personnel work. Although such tests are still widely,
and more discriminatingly, used for placement in business and industry,
newer studies of the relationship between intelligence and job satisfaction
do not appear in print. This may probably be taken as an indication of
the widespread acceptance of the relationship, but it is also due to the
increased recognition of the fact that intelligence is only one among
many complex factors in job satisfaction. It would seem desirable, however,
to supplement occupational norms of intelligence such as those
compiled from Army data, which show the relationship between intelligence
and usual occupation, with data on the relationship between intelligence
and satisfaction in each occupation. This would make possible
the establishment of more adequate critical scores than would otherwise
be possible. Guidance and placement in terms of prospects of being able
to compete with satisfaction as well as in terms of being able to hold a
job has been shown to result in less instability; clinical evidence suggests
that it also results in less irritability, aggression, self-recrimination, and
escape into fantasy.

Specific Tests

The Psychological Corporation’s catalogue of tests recently listed 22
group tests of intelligence, most of them suitable for use at the adolescent
and adult levels. Even this partial list is obviously too long for adequate
consideration in a volume such as this. Annotated catalogues are available
from publishers and distributors of tests, and brief critical reviews of
current tests appear periodically in the Mental Measurements Yearbook
(126). There is a need, however, for a systematic review of the research
which has been carried on with some of the widely used and more promising
tests of intelligence, in order to provide the user with a clear picture
of what has been done with these tests and with an understanding of
their demonstrated values and limitations in vocational guidance and
personnel work. It is only upon such a foundation that tests can be used
with maximum effectiveness and with minimum error. In attempting to
meet this need the writer’s task is simplified by the fact that there are
relatively few up-to-date tests of intelligence which have been widely
used in vocational guidance and selection, the statistically analyzed results
of which are to be found in the professional journals. Even so, it
seems wise to select a few representative tests and to treat them thoroughly
rather than to cover all those which deserve to be included. In this way
space may be conserved and the repetition of similar findings for test
after test avoided. A few other tests are discussed more briefly and others
are merely named. Thorough coverage of a few representative instruments
should provide the user of tests with insight into the nature and usefulness
of the types of tests in question, and enable him to make his own evaluation
of other tests in which he happens to be especially interested. The
selection of tests included in this or any other chapter, then, should be
taken simply as an indication that they have been used in enough investigations
for some facts concerning them to have accumulated, and as
evidence of the author’s preferences, rather than as a sign that these
particular tests are necessarily intrinsically superior to certain others
which are not treated in detail. In deciding to use some other test, one
should summarize all relevant data in a manner comparable to that of
this book.

The intelligence tests now used, whether individual or group, fall into 
three categories, which might be characterized as old type, new type, and 
factorial tests. A brief discussion of these types should provide a useful
orientation to the tests which are to be discussed in the following pages. 

Old type tests of intelligence consist of a variety of items arranged
either in the spiral omnibus form or according to type with a time limit
for each type, and yield only a total score or I. Q. The Stanford-Binet is
an individual test of this type; the Ohio State University Psychological
Examination, the Henmon-Nelson Test of Mental Ability, the Pressey
Classification and Verification Tests, the Terman-McNemar Test of Mental
Ability, the Pintner General Ability Tests, the various Otis Tests, the
Wonderlic Personnel Test, the Army Alpha Test, and the Army General
Classification Test are group tests of the old type. Although it is possible
to analyze some of these tests in such a way as to obtain more refined
estimates of the mental abilities of the persons tested, the tests were not
designed for this purpose and they have no norms for the interpretation of
such scores. To point this out is not to deny the value of the overall score
provided by any of these tests. Of these, appreciable amounts of vocational
validation data are available only for Army Alpha, the Army General
Classification Test, the Pressey, and the Otis and Wonderlic Tests.

New type tests include the same general type of items, but they are
either arranged according to type in the test blank or rearranged in this
way in the scoring process. These grouped items provide a total score, as in
the old type tests, but also part scores based on the type of item. These
part scores are generally verbal or linguistic and performance or quantitative.
The Wechsler-Bellevue Intelligence Scale is an individual test of this
type; the American Council on Education Psychological Examinations
and the California Mental Maturity Tests are group tests embodying the
same features. Norms are provided for linguistic and quantitative parts
with the objective of making it possible to study the special mental
abilities of the subject and to predict success in verbal or academic subjects,
on the one hand, and quantitative or technical subjects on the
other. Differential occupational predictions were expected to be made
possible by this type of special, as opposed to general, mental ability
score. A number of studies have been made of differential educational
prediction on the basis of the A.C.E. with conflicting results; these will be
taken up in connection with this test. Occupational evidence is still
practically not available, the California and Wechsler tests still being
relatively new and the A.C.E. having been used largely in educational
programs.

Factorial tests of intelligence are still in an experimental stage, although
the new type tests just described are based on the factor analysis work
which preceded the development of factorial tests. The subtests which
constitute a test of this type are included because they are heavily saturated
with statistically isolated factors which seem to be fundamental
components of intelligence. Although, in combination, they measure
what is commonly called general intelligence, factorial studies have shown
that they are relatively independent of each other and unitary in nature.
Scores based on these subtests are therefore used as indices of special, or
primary, mental abilities. These are not as coarse as verbal or quantitative
ability, which factor analyses have shown to be constellations of abilities
rather than unitary traits, but are more refined and include such verbal
aptitudes as word fluency and verbal comprehension, and such quantitative
aptitudes as spatial visualization and number facility. The only
published tests of this type are the Thurstone Tests of Primary Mental
Abilities. These will be discussed later as a promising technique still in
the experimental stages; they cannot yet be said to have been proved
useful.

Two group tests of intelligence will be taken up in some detail in
rounding out this chapter, and briefer discussions of three other tests will
follow. The two treated at length are the Otis Self-Administering Test
of Mental Ability and its derivatives, and the American Council on Education
Psychological Examination for College Freshmen. The three to
which less space is given are the Army General Classification Test, the
Thurstone Tests of Primary Mental Abilities, and the Wechsler-Bellevue
Intelligence Scale.

The Otis Self-Administering Tests of Mental Ability (World Book Co.,

The Otis Self-Administering Test was designed for use with senior
high school and college students, and with adults. Another form is suitable
for elementary and junior high school students. These have been
revamped by Otis for special answer sheet and stencil scoring as the Otis
Quick-Scoring Test of Mental Ability, and by Wonderlic as the Personnel
Test, both essentially the same as the Otis S.A. with improved scoring
techniques, improved time limits in the case of the Wonderlic, but less
adequate norms in each case. All three are widely used; the S.A. tests are
described here as there are more data for them than for the other two tests.

Applicability. The Otis should not be relied upon with older college
students and superior adults, as it is probably too easy. As Otis’ manual
indicates, a number of investigations agree that when high school seniors
and older persons are tested, it is preferable to use the twenty instead of
the thirty minute time limit in order to correct for this weakness. Oldi 1
(573) has demonstrated, however, that the standing of persons tested with
a twenty minute time limit should not be compared with that of persons
tested with the longer limit.

Contents. There are 75 mixed items arranged in order of difficulty,
some verbal, some arithmetical, and others spatial; they involve vocabulary,
sentence meaning, proverbs, number series, analogies, etc. A study
by Hovland and Wonderlic (384) reports that the arrangement of the
items is no longer the best possible and that as many as 25 percent of the
items are correctly answered by 90 percent of a large sample of adults
(N = 8300); for this reason the newer revisions are to be preferred as soon
as adequate norms become available. Crooks and Ferguson (183) found
the items less suitable for college students than for adults, in both validity
and difficulty level.

Administration and Scoring. There are no subtests to time, no special 
directions to give during the examination. The time required is 20 or 30
minutes (see above). Scoring is by means of printed keys, and the score is
the sum of the right answers. 

Norms. The norms for the test are based on the distributions of scores
for about 120,000 persons. Raw scores may be converted into Binet mental
ages derived from a combination of Herring-Binet scores and true
mental ages as calculated from the distribution of raw scores by age
groups. This correction of Otis’ data was deemed necessary because of the
selective nature of the high-school groups used in standardizing the test.

Bingham (91:338) has pointed out that Otis’ college median is lower
than that obtained by the College Entrance Examination Board. As Otis’
data come from a number of different colleges and represent all classes,
whereas the Board's were obtained from a limited number of highly
selective institutions, Otis’ college norms are more nearly accurate. That
they do not err greatly on the easy side is shown by the fact that the
average present day freshman makes an Otis I. Q. equivalent on the A.C.E.
Psychological Examination of 109. Otis’ median college student I. Q. of
111 is equivalent to the 57th college freshman percentile on the A.C.E.
norms, probably a little lower than it would be if the lower-ranking
freshmen had been eliminated. Differences between colleges are great, so
that local norms should be used in both counseling and selection: Otis’
manual reports median I. Q.’s for twenty-one colleges which range from
95 1 °

Factors Influencing Scores. Baxter (52) administered the Otis to 48
college students and found that time and work-limit scores had an
intercorrelation of .85, demonstrating that at that age level a speed score
measures the quality of the work the subject can do. Evidence has also
been reported (99) indicating that college students who read poorly, as
shown by the Iowa Silent Reading Test, are not underrated by the Otis
Test; this was ascertained by comparing their Otis scores with their
Army Beta (non-verbal) test scores, a comparison which is vitiated by
the important common speed factor. Scores have a very low negative
correlation (-.03 to -.30) with age in adulthood (459).

Standardization and Initial Validation. Otis’ manual gives unusually
complete and detailed information concerning standardization and initial,
but little on subsequent, validation. Many of the items in the tests
were taken from existing instruments. Preliminary editions were tried out
on high-school groups of about 1000 each. Items were retained if they
distinguished clearly between superior young students and inferior older
students in a given grade; the criterion of validity was therefore rapidity
of school progress. This suggests that the test, being academically
standardized, might not be a very valid one for non-academic purposes. Only
occupational validation, and studies such as Hovland and Wonderlic’s,
can provide the answer. The age and grade norms are based on large
samples from various sections of the United States, not a random nor a
stratified sample, but one large and varied enough so that to assume its
adequacy seems sound: the number for grade 0, for instance, is 15,715;
for grade 12 it is 24,72 j; for college students, 2516 from 21 colleges. These
norms are those provided since publication of the test, utilizing additional
data supplied by other investigators. Strictly adult norms have not
been published, despite widespread use at that age level.

Reliability. Forms A and B have an intercorrelation of .92. Reported
reliability coefficients range from .90 to .97 (171) with the 20-minute, and
of .86 with the 30-minute, limit with adults (577).

Validity. Otis suggested in his manual that the method of standardization
is the best indication of validity in an intelligence test. This has
already been described. He also attempted validation through correlation
with various criteria such as tests and grades.

Correlations with grades in several high schools were .55, .57, and .59,
the numbers ranging from 157 to 249. Segel (701) summarized six studies
with nine coefficients ranging from .20 to .43 and a median of .38, while
Hartson (348, 319) found correlations with scholarship of .39 in high
school and of .56 to .58 in college. Miller (532) found a correlation of
.69 with high school grades, and one junior high school study (883) reported
that the Otis test was the most useful of five tried. The test clearly
has a substantial relationship with educational achievement, one which
varies as one might expect with the practices of the school, the marking
systems of the teachers, and the range of ability and attitudes in pupils.

Correlations with other tests are as follows: the Otis had correlations
of more than .70 with Army Alpha, the CAVD, and other tests (577);
Otis-Terman Group and Otis-Binet coefficients equal about .33 and .30
(332, 577), the Otis I. Q. being 6 to 8 points lower than that obtained on
the 1916 Stanford-Binet (150, 855). Otis I. Q.’s tend to differ from Binet
I. Q.’s, especially at the higher extreme. These results are typical for this
type of test; the Terman Group Test, being anchored to the Binet, is most
like it, while the Otis, standardized without this base, is more closely
correlated with group tests such as Army Alpha. It is generally agreed that
the use of the term I. Q. for converted Otis scores is not strictly justified.
Otis pointed this out in his manual, but used the term I. Q. because it is
the standard method of measuring brightness, cautioning users of the
test always to specify “Otis I. Q.” Despite the statistical impossibility of
an adult I. Q., the chronological age factor in the ratio of MA to CA
having ceased to change after mid-adolescence, it is often convenient for
test users to think in terms of I. Q. equivalents.
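
The difficulty with an adult I. Q. can be seen by working the ratio of mental age to chronological age directly. The following is a minimal sketch in Python, not a procedure taken from the Otis manual: the function name, the sample ages, and the ceiling of 16 years placed on the chronological age divisor are all illustrative assumptions.

```python
# Ratio I.Q. = 100 * MA / CA. All names and sample values here are
# illustrative assumptions, not figures from the Otis manual.

ADULT_CA_CEILING = 16.0  # assumed age at which the divisor is frozen


def ratio_iq(mental_age, chronological_age, ca_ceiling=None):
    """Return the ratio I.Q.; optionally cap the chronological age divisor."""
    divisor = chronological_age
    if ca_ceiling is not None:
        divisor = min(chronological_age, ca_ceiling)
    return 100.0 * mental_age / divisor


# A ten-year-old with a mental age of twelve.
print(ratio_iq(12.0, 10.0))

# An adult of 32 with a mental age of 16: the literal ratio falls with
# every birthday, although the person has not changed.
print(ratio_iq(16.0, 32.0))

# Freezing the divisor at the assumed ceiling gives a stable equivalent.
print(ratio_iq(16.0, 32.0, ADULT_CA_CEILING))
```

With the divisor frozen, the figure obtained for an adult is a converted score rather than a true ratio, which is why it is better spoken of as an I. Q. equivalent.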

Correlation with Success on the Job. This topic has been dealt with at
some length earlier in this chapter in connection with intelligence as
measured by various tests. A substantial number of the studies referred
to in that section involved the use of the Otis Self-Administering Test,
which together with Army Alpha in its various revisions was probably
the most widely used test in business and industry during the 1920’s
and 1930’s, especially at the clerical, skilled, and semiskilled levels. In
this section, therefore, only specific findings which may aid in the
understanding and use of the Otis test will be mentioned.

Hay and his associates have used the Otis in selecting bank clerks and
calculating machine operators over a number of years at the Pennsylvania
Company (359). There it has been found desirable to use 36 as a
critical minimum raw score for clerical workers, with a 20-minute time
limit; this is equal to a 30-minute raw score of 46 and an I. Q. of 104.
When Otis scores were correlated with the production of machine bookkeepers
(558) they yielded a coefficient of .56 (N equaled 39), a figure which
was sustained by subsequent experience.

Shuman (716, 717) has reported studies dealing with success in skilled
employment. He studied supervisors and skilled workers such as toolmaker
learners and job setters, correlating Otis scores with ratings by
supervisors. The ratings had a reliability ranging from .70 to .91; the
validity coefficients ranged from .39 to .57, increasing with the skill and
degree of supervision exercised in the job. Critical scores were established
for each supervisory job, the minimum ranging from a raw score of 30 to
one of 33 for foremen on the Otis Quick-Scoring, the I. Q. equivalent being
88 to 91, whereas that for inspectors in the same plants was 51 (I. Q.
equals 109). Shuman calculated that the use of the Otis test would have
improved the selection of excellent skilled and supervisory workers by
from 15 to 20 per cent. Sartain (671) correlated Otis S.A. scores of groups
of 40 foremen and 85 assistant foremen with supervisors’ ratings, the
reliability of which was as high as .79 or as low as .48 depending upon the
comparison. The validity coefficients were .04 and .16; other tests were
no better.

Studies of semiskilled jobs have been somewhat more numerous. Forlano
and Kirkpatrick (268) analyzed the Otis test scores of 20 radio tube
mounters, whose work requires considerable finger and hand dexterity.
Each worker was a new employee, tested upon application for work; each
was rated “good” or “fair” by a supervisor after one month of employment.
There were as many fair as good employees among the group
making above average or average scores on the Otis (I. Q. 95 or above),
but six out of the seven employees who made below average scores (I. Q.
91 or less) on the Otis were considered fair and only one was considered
good. As the ratings were based on the induction and learning period,
this suggests that, in semiskilled work, having more than the critical minimum
of intelligence is desirable for rapid adjustment to the job, but that
additional increments of ability are of little value. It will be remembered
that this was the sole positive finding of Blum and Candee (105, 106) in
their study of the role of intelligence in another semiskilled job, packing
and wrapping; here there was no relationship between Otis scores and
production or supervisors’ ratings for regular employees (those who had
passed the learning period) and no relationship between intelligence and
production in seasonal employees (whose brief employment period makes
them learners for most of their period of employment), but the supervisors’
ratings of the latter group did show a slight tendency for the superior
male workers to be more intelligent than the inferior male workers.
The authors suggest that the failure to find a similar tendency among the
women seasonal workers may be due to rating on a different basis. In a
project of the Office of Scientific Research and Development, Salter (673)
found no relationship between Otis scores and submarine officers’ ratings
of enlisted men’s performance.

The Wonderlic Personnel Test, a revision of the Otis, was administered
to 769 applicants for ordnance factory work, together with other tests, by
McMurry and Johnson (900). The criterion of success in this study was
supervisors’ ratings of 587 employees still working when the follow-up
was made. Although some of the tests did have rather high validity for
some jobs, there were no significant correlations between intelligence and
any of the ratings. Tiffin and Greenly (846) administered the Otis to
women electrical fixture and radio assembly workers, with similar results.
Although other scores on tests were positively correlated with production,
there was no relationship (.23 ± .11) between intelligence and production.
As there was no analysis of the relationship during the learning
period, it is impossible to draw any conclusions concerning the role of
intelligence during induction into the job, but it is clear that, in these
and in many other semiskilled jobs, intelligence is unrelated to success
once the worker has made the initial adjustments.
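
The reading of a coefficient of .23 with an attached error of .11 as "no relationship" can be illustrated with the classical probable error of a correlation, PE = .6745 (1 - r^2) / sqrt(N). The sketch below is in Python and rests on assumptions that should be stated plainly: that the error term quoted is a probable error (the text does not say so), that a sample size of 34 is used purely for illustration since the study's N is not given here, and that the yardstick of roughly four probable errors is the older rule of thumb rather than anything stated by the authors.

```python
import math


def probable_error_r(r, n):
    """Classical probable error of a product-moment correlation."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


def ratio_to_probable_error(r, pe):
    """How many probable errors the coefficient is away from zero."""
    return r / pe


# Figures quoted in the text: r = .23, error term = .11 (assumed to be a PE).
ratio = ratio_to_probable_error(0.23, 0.11)
print(round(ratio, 2))

# Hypothetical N, chosen only to show the order of magnitude involved.
print(round(probable_error_r(0.23, 34), 3))
```

A coefficient only about twice its probable error would, under that rule of thumb, fall well short of being treated as dependable, which agrees with the interpretation given in the text.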

Success in skilled and semiskilled jobs has been correlated with Otis
scores in training situations by Paterson and associates (588) and by Sartain
(669). The former worked with junior high school boys, using a variety
of criteria, several of which were occupational rather than educational
in nature. The Otis was administered, together with a variety of other
tests, to 217 seventh and eighth grade boys, and correlated with instructors’
ratings of the quality of work done in producing standard samples
or projects in mechanical drawing and sheetmetal courses, and with an
overall rating of the quality of their shop operations (N equalled 100 in
this instance). These ratings were shown to have reliabilities of .87, .56,
and .68, using the odd-even technique, and .93, .72, and .81 corrected by
the Spearman-Brown formula. The correlations with Otis scores were,
respectively, .25, .16 to .19, and .21; although not high enough for use in
counseling individuals, these relationships were statistically significant
and indicate that intelligence plays some part in shop operations.
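
The Spearman-Brown correction mentioned above can be checked by simple arithmetic. The sketch below, in Python, is offered only as an illustration: the function names are hypothetical, and the only figures taken from the text are the corrected reliabilities of .93, .72, and .81. It steps a half-test (odd-even) correlation up to full length with r_full = 2r / (1 + r), and also inverts the formula to recover the half-test values which the corrected figures imply.

```python
def spearman_brown(r_part, k=2.0):
    """Reliability of a measure k times as long as the part that gave r_part."""
    return k * r_part / (1.0 + (k - 1.0) * r_part)


def half_from_full(r_full):
    """Invert the k = 2 case: the odd-even correlation implied by r_full."""
    return r_full / (2.0 - r_full)


# Corrected reliabilities quoted in the text for the three ratings.
corrected = [0.93, 0.72, 0.81]

# Odd-even correlations implied by those corrected values.
halves = [round(half_from_full(r), 2) for r in corrected]
print(halves)

# Stepping the implied halves back up should return the quoted figures.
print([round(spearman_brown(h), 2) for h in halves])
```

Run as written, the implied odd-even values come to about .87, .56, and .68, and stepping them up again returns .93, .72, and .81.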

Sartain’s study (669), unlike Paterson’s, used adult subjects in an industrial
situation, but unfortunately his criterion was more educational
than vocational in nature and a number of important details are not
supplied. He gave the Otis and other tests to 46 employees of the inspection
department of an aircraft factory who were taking a refresher course
for inspectors. The sex and age of the employees are not described, although
it is stated that many had considerable experience and some were
relatively new in the department. No information is provided as to the
type of inspection work done: failure of a given test to predict success in
inspecting engine assemblies would, for example, mean something quite
different from failure of a test to predict success in inspecting fuselages.
The two instructors rated each employee independently, their agreement
being indicated by the unusually high correlation of .77; when the subsequent
merit ratings of 20 of these employees who were on the job a
year later were averaged, the correlation between instructors’ ratings and
merit ratings was ..p>. This suggests that the immediate criterion was not
only fairly reliable, but also related to job success, even though based
on performance in a refresher course rather than on the job. The correlation
between Otis scores and instructors’ ratings was .64, higher than
that for any other test except the MacQuarrie Test for Mechanical Ability;
other mechanical aptitude tests yielded coefficients of from .2 j to ..jy.
In another study of three groups of foremen (N = 40, 53, 85) the criterion
was supervisors’ ratings (reliability = .79) but the validity of the Otis
was only .04 to .16.

Differentiation between Occupations. Despite the widespread use
with employed adults, no studies of intellectual differences between
occupations have been made with the Otis test. Shuman’s study (717)
established critical scores for certain jobs in one company, but these are of
limited applicability. Presumably occupational differentiation has been
so well established with other tests, from which conclusions may be
drawn for the Otis, that it has not seemed worth while to make such
investigations. It would certainly be impractical to try to improve upon
the sampling of the Army testing in both World Wars, defective though
it is in some respects.

Job Satisfaction. No studies have been located in which Otis scores
have been related to satisfaction either in the current job or in the usual
occupation. The general paucity of work on this topic has already been
discussed.

Use of the Otis Tests in Counseling and Selection. The evidence
concerning the use of the Otis tests in educational counseling and selection
clearly points to the conclusion that it is of value in estimating a given
student’s prospects of success in school or college. Although many other
factors need to be taken into account, and although the relationship
between Otis scores and grades varies from school to school and from college
to college, an individual's performance on such a test is one factor which
should be known by that individual and by the counselor or admissions
officer.

Concerning its value in vocational guidance and selection, the evidence
is not so clear. But this is only to be expected, in view of the greater
complexity of the occupational world and of the greater variety of demands
made upon the worker by the various jobs in which he might engage.
Despite this fact, it has proved possible to establish critical minimum
Otis scores for employment in clerical, in skilled, and in semiskilled jobs,
below which a disproportionately large number of workers fail and above
which a reasonable proportion succeed; research with other tests indicates
that this could also be done for executive and professional jobs.

It has also been demonstrated that, at least in some semiskilled jobs, 
the Otis is valuable in predicting the speed and ease with which the new
worker will make his initial adjustments to the job demands. 

Once the new worker has made the initial adjustment to a routine job,
the Otis score has no value in predicting success either in terms of
production or in terms of supervisory judgments. At least one exception to
this generalization is provided by machine bookkeeping, in which the
work is routine but mental rather than manual, and demanding of great
accuracy.

No other generalizations concerning the Otis tests and vocational
adjustment are warranted by the research. However, certain other
generalizations based on work with other intelligence tests which correlate
reasonably well with the Otis are possible. These have been discussed with
the supporting evidence earlier in this chapter.

Even if the results of studies of all intelligence tests and vocational
adjustment are thus taken into account, there is a dearth of longitudinal
studies of their predictive value in vocational guidance as contrasted
with selection. The vocational counselor must rely largely upon
deduction and generalization from validation studies in selection programs and
from cross-sectional studies such as those of the Army intelligence test
data, and upon cautious insights which use a thorough understanding
of the available research as a springboard for establishing working
hypotheses. More will be said on this subject in a later chapter on the
interpretation of test results (Chapter 20).

The American Council on Education Psychological Examination (The 
American Council on Education, yearly) 

Each fall the American Council on Education publishes a new form
of its Psychological Examination for College Freshmen, an intelligence
test used by some 300 colleges and universities. L. L. and T. G. Thurstone
of the University of Chicago have been responsible for the technical
work on the tests, and the constant revision of forms which are used
each year with thousands of entering college students has resulted in a
superior series.

Applicability. Designed for and standardized on entering college
freshmen, the test may also be used with high school seniors, but the
studies by Barnes (13) and Hunter (393) which have been made
concerning changes in scores with increasing age have demonstrated a need for
caution in making comparisons of high school students, older college
students, or adults with the normative group. In the latter investigation
87 of 105 college girls gained an average of 31 percentile points by their
senior year, 75 percent of this change occurring during the first year. The
fact that published norms are in terms of college freshmen has tended to
limit the use of the tests to that group; no tables are yet available to
make possible the accurate interpretation of scores made by high school
juniors or by college graduates.

Contents. Various editions of the test have included five or six sections
such as sentence completion, artificial language, same-opposites
(vocabulary), arithmetic reasoning, analogies (symbols, spatial), and number
series, all grouped more recently into two parts to give a quantitative
(arithmetic and spatial) and a linguistic as well as a total score. The items
are probably less affected by knowledge than those in most group tests,
for the emphasis in selecting items was to choose those which measure
ability to manipulate symbols rather than mastery of previously learned
facts. Thus in the artificial language test the subject is given a new
vocabulary into which he must make translations, and in the analogies test
he must pick out similarities and differences in unfamiliar symbols and
forms. As these tests and items have been selected and modified from
earlier tests and tried out over a period of nearly twenty years on large
numbers of subjects, with adequate funds for necessary research, they
constitute an unusually valid and reliable instrument.

Administration and Scoring. Each subtest is preceded by a practice
exercise, and both are closely timed. The test requires about one hour all
told. Scoring is simple, machine-scoring methods being applied even in
hand scoring.

Norms. Norms consist of percentiles for freshmen in liberal arts,
teacher training, and junior colleges, a type of norm more helpful in the
guidance of high school seniors planning further education than
comparison with freshmen in colleges in general. The numbers in each group
tend to be about 60,000, 12,000, and 12,000 respectively. It would be
desirable in counseling concerning the choice of a college to have norms
for specific institutions, in order to help choose one in which each
student is most likely to succeed and to be satisfied. Unfortunately, the need
to “safeguard” the reputation of an institution keeps such data from
being published, although in the long run each college would probably
gain if it did declare its interest as being in students of a certain fairly
broad mental level and in supplying a kind of education appropriate to
that level. College admissions officers use local norms for such tests as the
A.C.E. in evaluating doubtful candidates for admission. The Thurstones
have not supplied I. Q. equivalents because of the artificiality of adult
mental ages; such equivalents are provided each year by the Educational
Records Bureau and are helpful in interpreting A.C.E. scores in terms
useful in generalizing from college to vocational competition.

In the absence of norms for specific institutions, the next best type
would be norms for clearly defined and homogeneous groups of
institutions. The present classification of colleges into four-year, junior, teachers,
and technical and professional colleges might seem at first glance to
provide these, but as Crawford and Burnham (iHcucju-q j) have pointed
out this is not the case. The four-year liberal arts colleges, for example,
cover a range of scholastic aptitude which is almost as great as that of all
four types of institutions (90 percent). They are, therefore, an extremely
heterogeneous group; while the norms may be typical of colleges in
general, the range is so great as not to be very helpful in counseling an
individual about the choice of a specific institution. Crawford and Burnham
point out that the average Yale freshman is at the 90th percentile on the
general norms, and nearly 80 percent of these freshmen exceed the
national 75th percentile. Norms should be provided for various classes of
liberal arts colleges, adequately defined.

Studies of sex differences reveal negligible differences in total scores,
masculine superiority in quantitative parts, and feminine superiority in
linguistic parts (840). This checks with data on interests reported by
workers with Strong’s interest inventory.

Factors Influencing Scores. Smith (723) has reported finding higher
scores among urban than among rural students, as have other studies of
urban-rural differences. Whether this is primarily the long-term result
of selective migration or the effect of environment and urban-constructed
tests is still a question. Barnes (ja) found that two years of college
mathematics had no appreciable effect on the Q scores of an experimental group
of 40 students, when compared with 75 controls who had equal Q scores
as entering freshmen but took no college work in mathematics.

Standardization and Initial Validation. New forms of the A.C.E. tests
are constructed so as to resemble earlier forms, although there are
differences in details and innovations are gradually introduced as new
types of items are tried and adopted. Each new form is thus based on
extensive previous work which has proved its validity; in addition, it is
administered for tentative standardization to 1000 or more students who
have also taken the preceding form. The scores of some 60,000 college
freshmen who take the test each fall provide final norms. Studies have
occasionally been made to determine the academic predictive value of the
examination and to establish its reliability. The assumption is usually
made, however, that since the new edition is anchored to the preceding
editions and has similar norms it will be approximately as reliable and
valid as they. A report is published each year in the American Council
on Education Studies, giving data on the form published the preceding
fall.

Reliability. The reliability of the A.C.E. tests has been consistently
high. One study by the test authors reported odd-even reliabilities of .95
for the total score, and of .87 and .95 for the Q and L scores respectively,
for the 1938 college edition (840). Yotaw (904) found a correlation of .7}
between Otis scores in 7th grade and A.C.E. scores six years later (N = yn).
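
The odd-even procedure mentioned above can be illustrated with a minimal sketch. It assumes item-level 0/1 scores for each examinee and applies the Spearman-Brown step-up to the half-test correlation; the text does not say how the test authors computed their coefficients, so the function names and the correction step are assumptions made only for illustration.

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def odd_even_reliability(item_scores):
    """item_scores: one list of 0/1 item scores per examinee (hypothetical layout)."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    # Spearman-Brown step-up from half-length to full-length test (assumed here)
    return 2 * r_half / (1 + r_half)
```

Because the two halves are taken at a single sitting, a coefficient of this kind reflects internal consistency rather than stability over time, which is why the six-year correlation with the Otis is reported separately.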

Validity. It is generally accepted that one indication of the validity
of an intelligence test is the carefulness of its standardization. The care
used in this series of tests is illustrated by subtest intercorrelations for
the 1938 form which range from .90 to .hr, with a median of .39 in an
attempt to measure relatively distinct components of intelligence (840).
The high reliability of part scores mentioned above is another illustration.
Another illustration is provided by the specific college norms reported
by the authors (8 pi) and by Traxler (858) who converted A.C.E. scores
to I. Q. equivalents and ascertained the median I. Q.’s of the freshmen in
929 colleges. These ranged from 126 in a private liberal arts college to 87
in one junior college. The median for liberal arts colleges was about 110,
for teachers and junior colleges about 107. Schneidler and Berdie (680)
have reviewed similar data. As has been shown in numerous earlier
studies, there is a college for almost every I. Q. level. It is regrettable that
they cannot be identified by professional counselors.

Correlation with Other Tests. The A.C.E. test has frequently been
correlated with other intelligence tests. With the 1916 Binet a correlation
of .69 (j.|o) has been reported, while for the 1937 Revision it is .58, .62
(16) and .67 (507). With the Otis S.A. Higher Form, coefficients of .78
and .82 were found by Traxler (86_j). Hildreth found that the A.C.E.
gave approximately the same percentile ranks in the senior year of high
school as the Binet had given previously to the same children in
elementary school (372). Anderson and others (16) reported correlations
of .48 and .53 between two different forms and the Wechsler-Bellevue;
the two verbal scales are about as closely related (.49, .51), but the
performance and quantitative scales have less relationship (.31, .39), a fact
needing further investigation to make it clear just what types of
concrete mental ability each of these scales measures. Certainly it would be
dangerous to interpret Wechsler-Bellevue performance I. Q.’s in terms of
A.C.E. Q-score validities, or vice-versa.

The use of performance or quantitative scores in educational and
vocational guidance is in any case still largely hypothetical, although in
some selection programs specific evidence has been collected which makes
possible the use of part scores. The writer administered the 1938 college
edition of the A.C.E. to 123 high school juniors and seniors, together
with the Nelson-Denny Reading Test, the Minnesota Vocational Test for
Clerical Workers, and the Co-operative Survey Test in Mathematics
(792). The results, shown in Table 7, indicate that A.C.E. linguistic
scores are more closely related to reading ability than are either
quantitative or total scores, that linguistic scores predict achievement in
mathematics as well as do quantitative scores, that linguistic scores are more
closely related to name-checking scores than are quantitative but that
they are equally related (or unrelated) to number-checking scores.
Traxler (863) found r’s of .2f> between Bennett Mechanical Comprehension
scores and Q scores and of .31 between the same test and L scores.
Apparently the latter are a better measure of general ability than the former,
and neither is a superior measure of special aptitudes. It will be seen
later that there is some evidence to support the belief that quantitative
and linguistic scores have differential predictive value for college courses,
but the evidence is conflicting, and data such as those just presented
suggest that they are actually comparable in predictive value except for
the closer relationship of linguistic scores and reading ability.

Table 7

RELATIONSHIP OF A.C.E. PART-SCORES TO OTHER ABILITIES

Bryan (123) and Estes (240) have reported correlations of .05 to .36
and .45 between Q scores and the Minnesota Paper Form Board (Revised).
In the former study, the spatial subtests correlated .r,r, with the Paper
Form Board. This is a lower correlation than is generally found between
intelligence tests and the Paper Form Board (p. 301), perhaps because of
the homogeneous population.

A totally different line of investigation was opened up by Munroe in
a study of the relationship between A.C.E. scores and Rorschach indices
(553). She administered both tests to 80 students at Sarah Lawrence
College, and ascertained the difference between the Q and L percentiles
for each girl. These difference scores were distributed, and the top and
bottom quartiles were selected for further study. This gave Munroe one
group of “higher L’s” and one of “higher Q’s.” The Rorschach patterns
of each of these groups were then analyzed and contrasted, with the
following conclusions:

There were no differences in general adjustment as measured by the Rorschach
Inspection Technique.

There were no differences in the number of responses nor in the number of
words in the protocols of the two groups.

The higher Q’s gave a significantly larger percentage of responses in which form
was the determinant.

The higher Q’s gave significantly more accurate form responses.

The higher L’s gave significantly more movement responses.
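
The grouping step in a study of this kind can be sketched as follows. This is only an illustration under stated assumptions: each record is a hypothetical tuple of student identifier, Q percentile, and L percentile, the extreme groups are taken as the top and bottom quarters of the Q minus L differences, and ties are broken arbitrarily. The source does not describe Munroe's handling of ties or the exact cutting points.

```python
def split_by_difference(records):
    """Return (higher_q, higher_l) identifier lists from the extreme quarters.

    records: list of (student_id, q_percentile, l_percentile) tuples
    (a hypothetical layout chosen for this sketch).
    """
    # Difference score for each student: Q percentile minus L percentile
    diffs = sorted((q - l, sid) for sid, q, l in records)
    k = len(diffs) // 4  # size of each extreme quarter
    if k == 0:
        return [], []
    higher_l = [sid for _, sid in diffs[:k]]   # most negative differences
    higher_q = [sid for _, sid in diffs[-k:]]  # most positive differences
    return higher_q, higher_l
```

With 80 students this yields two groups of 20, the middle half being set aside, so any contrast found describes the extremes of the distribution rather than the typical student.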

The personality picture obtained from the above data is one of a
subjective, imaginative, higher L syndrome, and of a more objective,
literal, outer-reality-bound higher Q syndrome. The latter type (if
persons at the extreme of a continuum may be called that) resembles that
found in paleontologists by Roe and described in a later chapter.

In pointing this fact out, Roe also states that the higher Q’s were found to
choose more scientific courses than the higher L’s. If these findings are
confirmed by other studies it would seem that differences in quantitative
and linguistic scores may be indicative of differences in the utilization of
intelligence arising from differences in personality, as well as, or perhaps
even rather than, differences in primary mental abilities. Such a radical
conclusion would be compatible with the findings that Q and L scores
are not differentially related to success in quantitative and linguistic
subjects, and are related to the choice of one or the other type of
curriculum. It would not fit in with contemporary factor theory.

Correlation with Grades. The relationship with achievement has been
most intensively studied, academic prediction being the purpose of the
test. Studies from various earlier editions yielded validity coefficients
ranging from .17 to .81 for grade-point averages (284) and from .34 to
.60 with freshmen marks, and correlations of .43, .43, and .54 (284,456,
49b 195 ’ G 3 2 * 7 ° 5 ) with long-term averages of groups of 228, 1,052 and 378
students. Subsequent studies reported correlations of .39 to .60 with
grades in various colleges (16,127,195,553), the mode being about .55.
Modal correlations with first semester grades are about .45 for engineers
and .50 for art students. For grades over four years the correlations are
about .45.

Weintraub and Salley (915) found that, at Hunter College, 14 percent
of the upper half of a freshman class of 106 \ students were dropped for
poor scholarship over the four-year course, as contrasted with 24 percent
in the lower half on the A.C.E. The range of intelligence in this group
was of course limited.

At the University of Chicago (840) correlations with introductory
biology marks ranged from .43 to .47; humanities, .46 to .53; physical science,
.39 to .46; social science, .46 to .51 (N = 200 to 2000). Slightly (2 to 6 pts.)
higher results were reported by Shanner and Ruder (712). The correlation
with marks for students of agriculture in another institution was .49;
engineering, .45; general, .49 (546). This appears contrary to the
suggestion of some that the test should be more valid in liberal arts than in
other colleges.

Part-scores have been related to achievement in specific subjects and
fields by several investigators. Segel and Gerberich (j) correlated
part-scores with marks in English, foreign languages, and mathematics, with
the results shown in the left-hand column of each pair in Table 8.
Coefficients for variables which should theoretically be highly correlated are
shown in italics.

Table 8

CORRELATION BETWEEN A.C.E. PART-SCORES AND ACHIEVEMENT

In another study by the same authors (704) part scores were correlated
with Iowa Placement Test scores in the same subjects. The results appear
in the right hand member of each pair of columns in Table 8. The
discrepancies are such as to be surprising were it not for the unreliability
of marks; nevertheless, patterns of ability and achievement seem to exist,
the verbal tests being more closely related to the verbal subjects, the
quantitative tests (in one case) to the quantitative subjects. Similar
curricular relationships were later found at the University of Florida (y.jG).
Work such as this, combined with Thurstone’s factor analysis (840) led
to the use of Q and L scores in more recent editions. Evidence which
indicates a need for caution was published by the writer (792), to the
effect that whereas the total scores on the A.C.E. test correlate .fry with
the Co-operative Survey Test of Mathematics, both the Q and L scores
correlate .yf> with the same test, Q scores giving a prediction of
achievement in mathematics in no way superior to that yielded by L scores. On
the other hand, while the total score has a correlation of .fib with the
Nelson-Denny Reading Test, that for Q is .97 and that for L is .80,
indicating a genuine difference in Q and L scores. Generally similar results
have been obtained by four other investigators using grades as criteria
( 1 b, j2.yog.7t) j). MacPhail’s study (yog) involved analyses of data at both
secondary and collegiate levels; the latter were treated in terms of both
curriculum and courses. Representative data from two of his tables are
reproduced in Table 9.

Table 9

Of the courses for which data are not reproduced here, only psychology
among the “quantitative” subjects showed a possibly significant difference
between the correlations, and that was in favor of the L score; there
were no significant differences among the other “linguistic” courses. The
conclusion drawn by MacPhail is that data of this type must be obtained
by each institution if it wishes to use Q and L scores for selection and
guidance; certainly any blanket use of such scores in counseling is now
unwarranted, and, if one were to generalize from his study (as adequate
as any now available), it would be to the effect that L scores are as
satisfactory as Q scores for predicting success in mathematical and scientific
courses, and perhaps slightly more satisfactory for predicting achievement
in some linguistic and verbal courses.

Estes (240) correlated A.C.E. scores with grades in analytic geometry
for 76 engineering freshmen with the following results: r Q and grades =
.33, r L and grades = .15. This agrees with MacPhail’s findings. Bryan
(123) found correlations between A.C.E. scores and art grades varying
from .02 to .37 for various types of art students (N = 1008), those for the
quantitative parts tending to be slightly lower than those for the verbal,
but the trends are not significant.

Part scores on tests such as this presumably measure constellations of
primary abilities, as Thurstone (840) has shown, although Munroe’s
exploratory work on personality relationships (p. 119) raises important
questions. These may be related to achievement in special fields as reported
by some investigators, but it is obvious that more conclusive evidence
is needed before A.C.E. Q and L scores are relied on in differential
prediction or counseling.

Correlation with Success on the Job. It is to be regretted, in view of
its excellent construction, widespread use and the extensive information
on hand concerning it, that the A.C.E. test has not been adequately
validated for vocational guidance and selection at the business and
professional levels. There are practically no validation studies of this test
using strictly vocational criteria, although several studies have shown that
its total scores are related to success in some types of professional
training, e.g., engineering (pjj), and, in some institutions, nursing (619).
Seagoe (688) found that well-adjusted student-teachers, and maladjusted
student-teachers of average or low intelligence in one college, tended to
remain in training, whereas the bright but maladjusted students dropped
out, perhaps because they recognized the misfit and saw other more
appropriate opportunities. Ratings of success in practice teaching did not
correlate significantly with A.C.E. scores. Rolfe (643) found no
relationship (r = -.10) between A.C.E. scores and the teaching success of 52
Wisconsin one- and two-room school teachers, the criterion being tested pupil
progress. Rostker (652), however, applying similar techniques to 28
teachers of 375 seventh and eighth grade pupils found a correlation of .57.
Perhaps teaching in larger schools is a more intellectual activity. Bransford
and others (117) found a correlation of .64 between A.C.E. scores and
ratings of the administrative effectiveness of 20 civil servants at the top
management level. These findings suggest that intelligence as measured by
the A.C.E. plays a part in the intellectual aspects of some vocations,
including those important in training, but that in other occupations,
whether in training or in practice, other factors are more important.

Differentiation between Occupations. Two studies have found the 
usual relationship between parental occupation and student intelligence. 
Byrns and Hennion (129) found significant differences between adjacent 
occupational levels, except the business and clerical and the skilled and 
semiskilled. Smith’s study (721.’), based on 5487 students, found similar
differences. 

Job Satisfaction. No studies with this test have been located which
bear, directly or indirectly, on job satisfaction although Berdie (78)
showed that A.C.E. scores were not related to satisfaction with training
in engineering.

Use of the A.C.E. Psychological Examination in Counseling and
Selection. This review of the A.C.E. Psychological Examination shows that
it has been studied in most of the ways in which other tests have been
tried, although rarely in investigations of vocational adjustment. There
is probably more material concerning its educational significance than
there is for any other single test; it is a reliable and valid test of scholastic
aptitude or general intelligence at the college level. The test goes beyond
this, however, in attempting to break down the concept of “general
intelligence” by providing part-scores for what logical and statistical
analysis indicate may be special aspects of intelligence. As Thurstone (840)
has shown in a factor analysis of the 1938 edition, these aspects of
intelligence are not primary abilities or factors, but constellations of related
factors. This breakdown is thus a compromise attempt to take advantage
of the findings of factor analysis and yet to provide a practical measure
for administrative and guidance use. It is promising because it represents
a step in advance in group testing technique without departing so far
from proved techniques as to make it a purely research instrument, but
its part-scores are still of uncertain value in differential diagnosis and
prediction.

The freshman college norms are perhaps the most adequate available.
It is unfortunate that the same forms are not standardized at other
educational and age levels, and that its vocational significance is not
better established. However, the high correlation with other intelligence
tests, together with the equivalent scores which have been made available,
make it possible cautiously to use occupational and educational norms
established for other tests. It should be remembered in so doing that Otis
I. Q.’s are not the same as Binet I. Q.’s because of different methods of
calculation, that I. Q.’s are artificial equivalents and not true ratios of
mental to chronological age at older adolescent and adult levels, and
that equivalent scores are based on averages and may therefore be
distorted in extreme cases.

The Army General Classification Test (The Adjutant General’s Office,
War Department, 1940; Science Research Associates, 1947)

This test was devised by the Adjutant General’s Office when Selective
Service was adopted in 1940, as a substitute for the widely used Army
Alpha of World War I. The two original forms designated by the Army
as AGCT-1a and AGCT-1b were used from October 1940 and April 1941,
respectively, to October 1941. The two final forms, AGCT-1c and
AGCT-1d, were equated with the first two and were used in the testing of
all men and women who were inducted into the Army between October
1941 and April 1945. AGCT-1 was administered to a total of well over
9,000,000 persons. It was so widely used that more than 4000 persons
daily were tested. With the introduction of a completely revised
classification test, based on more modern principles of intelligence test
construction and yielding separate scores for verbal, numerical, and spatial
aptitudes, forms 1c and 1d became obsolete. The large number of men
and women who had been tested with these forms, and the vast amount
of educational and occupational validation data which had been
accumulated for them, made them unique in the history of psychological or
vocational testing. Two forms were therefore released for civilian use,
the first civilian edition appearing as forms AH (hand scored) and AM
(machine scored). This, it should be noted, is AGCT-1a, which is not the
widely used Army form, but its predecessor, to which the widely used
forms, 1c and 1d, were calibrated.

Applicability. The AGCT was designed for use with draftees, that is,
with young men between the ages of 18 or 20 and 36, with widely varying
amounts and types of education, and with even greater differences in
general cultural background. In order to make the test applicable to this
group, an attempt was made to avoid items which might be greatly
influenced by schooling beyond the first few grades and by other cultural
inequalities. Information items were not used. Instead, vocabulary,
everyday arithmetic, and spatial items were included. A special effort was
made to make the items seem sensible to young men from all walks of
life. The data on distributions of test scores, for example the occupational
norms to be discussed later, indicate that the objective of getting a
wide-range intelligence test of reasonable brevity was achieved. It was used
also with young women who volunteered for the Army, and the data on
such groups give no reasons for questioning its applicability to women.
Observation of the use of the test with both men and women suggests
that they find the types of items acceptable, although the block-counting
sections apparently make a special impression: the test is often referred
to by examinees as “that test with block counting in it.”

As military experience showed that young men and women with widely
varying amounts of education seemed to be able to manage this test, it
seems likely that it could be used also in the last years of high school.
However, no objective evidence on this point has as yet been published.
Its use with older people might be questioned, for although the
correlation with age in a representative enlisted population was only .02, it was
-.gg and -.20 for two groups of officers which included many men who
were older than most draftees (7gf>.7g7). As pointed out in the official
report, this is probably due to the influence of the speed factor, although
an attempt had been made to minimize that by a time limit in which all
examinees could, if not finish, at least show their power.

Content. The test consists of three parts: vocabulary, arithmetic
problems, and block counting. Three practice parts introduce the test
to insure familiarity with the procedure. A sample vocabulary item is
“To permit is to, a) demand, b) thank, c) allow, d) charge.” The
arithmetic problems involve real life situations, such as dividing rounds of
ammunition among a group of men, finding out how many more cows
one man has in comparison with his neighbor, and computing the amount
of money each man on a baseball team would have to contribute in order
to supplement the club’s treasury in buying uniforms. The block-counting
items are of the familiar type, like those used in the MacQuarrie.

There are go practice items in the civilian edition, and 1 go test items,
in contrast with 10 practice items and i jo test items in Army editions 1c
and 1d. The manual does not indicate which Army form was used, but
this fact suggests that it is one of the two older forms (actually Form 1a,
confirmed in a letter from John Yale of Science Research Associates,
dated April 14, 1948). AGCT-1a was standardized on 2(175 men aged
20-29. Form 1b was standardized on gRgf) men who also took Form 1a,
in 1941. The correlation between scores on the two tests was found to
be .95, and their means and standard deviations were practically
identical. Forms 1c and 1d were prepared immediately after 1b, administered
to 1782 men, and compared with 1a. The two new forms were found to
be somewhat more difficult than 1a, and somewhat more discriminating
in the upper ranges (736:763); no comparisons were made with form 1b,
but presumably the same would be true of it.

Administration and Scoring. The testing time is 40 minutes. Directions 
in each booklet are complete, making the test self-administering. The 
civilian edition uses the step-back format, in which each page is slightly 
narrower than the one before it and the answers are recorded on 
successively exposed columns of the answer sheet. This has the great 
advantage of making a manageable booklet and answer sheet, and of 
minimizing recording errors. The hand-scored form provides the examinee 
with a pin with which to prick holes in the answer sheet instead of 
marking it. The holes which appear in marked areas of the back of the 
answer sheet are counted and indicate the number of right answers. 
Scoring takes only about one minute per test. Raw scores are converted 
into standard scores known as Army standard scores, for which the mean 
was intended to be 100 and the standard deviation 20. These can also be 
converted into percentiles, a table in the manual being provided for this 
purpose. 
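
The relation between a standard score of this kind and a percentile can be sketched as follows. This is only an illustration: it assumes a normal distribution with the intended mean of 100 and standard deviation of 20, whereas the manual's own table (not a normal-curve formula) is the actual conversion, and the function names are hypothetical.

```python
from statistics import NormalDist

# Scale constants stated in the text: intended mean 100, standard deviation 20.
ARMY_MEAN = 100.0
ARMY_SD = 20.0


def standard_score_from_z(z, mean=ARMY_MEAN, sd=ARMY_SD):
    """Place a deviation expressed in sigma units (z) on the Army scale."""
    return mean + sd * z


def percentile_from_standard_score(score, mean=ARMY_MEAN, sd=ARMY_SD):
    """Percentile rank of a standard score under an ASSUMED normal curve."""
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(score)


# One sigma above the mean falls at about the 84th percentile of a normal curve.
for score in (80, 100, 120):
    print(score, round(percentile_from_standard_score(score), 1))
```

Where a published table departs from these values, the table reflects the obtained distribution of the norm group rather than the idealized curve assumed here.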

Norms. As the extensive Army norms are for AGCT 1c and 1d (more 
than 8,000,000 men), it is to be regretted that the civilian form is one of 
the preliminary editions. As they are very similar, even though not 
identical, it may be safe to use the general norms. 

The manual provides a table for the conversion of raw scores into 
Army standard scores and percentiles. There is no indication as to 
what size or type of group this table is based on. It is military, but whether 
or not it is the standardization group for the same form, or a much larger 
group tested with equated forms, is uncertain. A sentence elsewhere in 
the manual indicates that it is based on 100,000 (undescribed) inductees. 
The mean raw score of the standardization group used with Form 1a was 
78, which gives a standard score of 102 and a percentile of 45 according 
to the manual; the percentile would be the 50th if the standardization 
group were the norm group. As the manual’s norms are for a larger 
number of persons than were tested with this form, data from other forms 
which had been calibrated with this one must have been used. Such 
matters should be made clear in the manual or in accompanying 
publications. 

Occupational norms, in the form of bars representing the middle 50 and 
80 percents of each of 120-odd occupations, are also included in the 
manual. Again, it is not clear what forms of the test were administered 
to these groups. If 1c and 1d were used, the norms may not be strictly 
applicable to the civilian form (AGCT-1a), which was found to be easier 
and less discriminating at the upper levels. Persons of average or 
high-average ability would seem more able to compete in executive and 
professional work than they actually are. The means in the manual's 
occupational norms are almost identical with those of the longer list of 
occupations covered by Stewart’s analysis (758), but the numbers of cases 
are in some instances smaller, and in some larger, than hers. 

Standardization. The standardization of the various forms of the 
AGCT has been described in readily available journals by the staff which 
developed it (736,737) and need not be repeated here. Steps which should 
be noted include the fact that a large item-pool was developed, and the 
seemingly most appropriate items were selected from it; each successive 
form was equated with the previous forms (but as noted previously 1a 
and 1b were somewhat easier than 1c and 1d); the estimated mean of the 
first form proved to be too low, so that when the calibrated scores of later 
forms were standardized the actual mean standard score was between 100 
and 1 m rather than 100, one sample of more than 91,000 men having a 
mean of 107 (77* j). 

The reliability of the various forms was ascertained, the retest 
reliability with varying intervals between tests being .82, the alternate-form 
reliability between .89 and .95, the Kuder-Richardson reliability between 
.9 I and .97, and the corrected odd-even reliability .97 (736:765). These 
are quite satisfactory. 
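
A "corrected" odd-even reliability is conventionally obtained by stepping up the correlation between the two half-tests with the Spearman-Brown formula. The sketch below assumes that this is the correction meant (the text does not name it), and the function names are hypothetical.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Estimated reliability of a test lengthened by length_factor,
    given the correlation between comparable parts (r_half)."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def implied_half_test_r(r_full):
    """Invert the doubling correction: the odd-even correlation that
    would step up to a full-length reliability of r_full."""
    return r_full / (2.0 - r_full)


# Odd-even correlation implied by the corrected coefficient of .97 in the text.
r_half = implied_half_test_r(0.97)
print(round(r_half, 3), round(spearman_brown(r_half), 2))
```

On this reading, a corrected coefficient of .97 corresponds to an uncorrected correlation between halves somewhat below that figure.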

Validity. As the AGCT was devised as a measure of learning ability 
and routinely administered to all enlisted men and women in the Army, 
it was used as a predictor of success in training for many types of 
specialties. But it was also possible to relate scores on this test to certain 
criteria from the previous civilian experience of the persons tested, such as 
the amount of education they had obtained (it having been well established 
by other studies that brighter people tend to get more education) and 
civilian occupation (it has been seen that occupations can be ranked 
according to their intellectual requirements). 

Education, as measured by the highest grade attained, was correlated 
with the AGCT scores of 4330 men, the coefficient being .73 (736). This 
may be unduly high, because socio-economic status is correlated with each 
of these variables, but it is an indication that the test has some of the 
validity which has generally characterized good intelligence tests. 

Tests of intelligence which have been correlated with the AGCT 
include Army Alpha, Otis S.A., and the American Council on Education 
Psychological Examination (7;p>). The most representative populations 
for which such data have been published ranged in numbers from 750 
to 1 (i.](). The correlations were .90 for Army Alpha, for the Otis, and 
.79 for the A.C.E. 

Other tests with which the AGCT has been correlated include those 
used in the selection of aviation cadets (cm j: Table 9.9). The correlation 
with a test of reading comprehension was .r K g, mechanical comprehension 
and mathematics [...]. The correlations with tests of manual dexterity, 
co-ordination, and similar capacities were generally below .20. These data 
were obtained from a group of more than 1000 unselected applicants for 
cadet training. 

Success in training was the most commonly used criterion for the 
validation of the AGCT. A summary of such results was compiled by the staff 
of the Personnel Research Section of the Adjutant General’s Office (7^(9. 
and is reproduced here with additional data from DuBois (21 \). The 
means and sigmas of the various military training groups are given, 
together with the correlations with the criteria. As the authors point out, 
preselection of students, sometimes on the basis of this test, makes the 
relationship seem lower than it actually is, in some instances, whereas in 
others the true relationship is shown. Motor mechanics, for example, 
were not preselected, and r equalled .69; teletype maintenance students 
were preselected, and in their case r equalled .20. It would be necessary 
to sort these data into at least two groups, according to whether or not 
they had been preselected, in order to generalize concerning the types 
of training in which the test best predicted success. Even then, it would 
be necessary to be cautious, because of the presumably academic nature 
of much of the training, even for specialities which were very concrete 
and practical. The example of Navy aerial gunnery training has been 
cited elsewhere (p. y } ) as evidence of the fact that intelligence tests 
sometimes predict success in training because the training is unnecessarily 
abstract, and that when the training is made more life-like intelligence 
tests lose their predictive value. 
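
The effect of preselection on a validity coefficient can be illustrated with the standard correction for direct restriction of range on the predictor (often called Thorndike's Case 2). This correction is not reported in the source; the ratio of unrestricted to restricted standard deviations used below is purely hypothetical, as are the function and parameter names.

```python
from math import sqrt


def correct_for_range_restriction(r_restricted, sd_ratio):
    """Estimate the correlation in the unselected group.

    r_restricted: correlation observed in the preselected group.
    sd_ratio: SD of the predictor in the unselected group divided by
              its SD in the preselected group (1.0 means no restriction).
    """
    r, u = r_restricted, sd_ratio
    return (r * u) / sqrt(1.0 - r * r + r * r * u * u)


# Observed r of .20 (teletype maintenance students, from the text) with a
# HYPOTHETICAL halving of the predictor SD through preselection.
print(round(correct_for_range_restriction(0.20, 2.0), 2))
```

The correction presupposes linear regression and equal error variance across the score range, so it is best read as a rough indication of how much a preselected coefficient may understate the relationship.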

It is worthy of note, as the AGO authors pointed out, that the 
correlations between AGCT and grades in Army Specialized Training (college 
courses) and also in most West Point courses, tend to be low. They 
range from .12 to ..jo. The authors point out that this is no doubt partly 
due to the extreme preselection which had taken place in both 
programs. Despite this, however, the correlations with grades in English 
and Mathematics at West Point were . jo and ..pp. Strong (776) has pointed 
out another reason for the poorer predictions in specialized training, 
namely, the fact that a substantial number of men were sent to training 
in which they had little genuine interest, either because they thought it 
would be a pleasant type of assignment or because quotas had to be 
filled. With motivation undermined in this latter way, the correlation 
between ability and grades would be definitely lowered. 

The value of the AGCT as a predictor of success in pilot training 
can be ascertained by comparing it with the tests of the Aviation Cadet 
Selection program. It is obviously not relevant to compare it with tests 
of special aptitude, interest, or temperament, but it may legitimately be 
compared with the general qualifying examination administered to 
applicants for preliminary screening, in order to ascertain the relative 
value of general intelligence tests and of custom-built tests of ability to 
adapt to the learning requirements of a specific training program. Table 
10 has shown that in the experimental group of more than 1000 cadets 
sent to pilot training regardless of test scores the AGCT had a validity 
of .31 with a pass-fail criterion. For this same group, with the same 
criterion, a test of learning ability designed with flying training 
specifically in mind had a validity of .;,o (lm j: 1 <> 1). The pilot stanine 
(weighted combination of special aptitude test scores) had a validity of .(We 
Obviously, although the general intelligence test had some value for 
predicting success in pilot training, it did not measure certain factors 
which were of considerable importance and which were tapped by the 
more specialized tests. 

Occupational differences have been studied with the AGCT as with 
Army Alpha, but so far only for a one percent sample of the tested 
population. Some of the data for this test are presented in Table \. on 
pages 96-97. Stewart’s paper (758) has shown that, as in the case of World 
War I data, occupations can be ranked according to a hierarchy of 
intelligence, there is considerable overlapping of occupational groups, and 
the spread of intelligence is greater in the lower-level (less selective) than 
in the higher-level (more selective) occupations. It is worth noting that 
although (jo percent of the highest ranking occupational group in either 
sample, accountants, made scores of 1 1 j or better, more than 10 percent 
of the men in the least able occupational group, lumberjacks, made 
equally high scores. The overlapping is even greater among the 
occupations which are nearer to the middle of the distribution. Scores on this, 
as on other, intelligence tests can therefore give only a very general 
indication of the occupational level at which a person might best aim, 
notwithstanding the great variety of available occupational norms which 
seem to indicate the contrary. 

Stewart’s analysis compares occupational ranks in World War II 
with those found in World War I data. She found that only gunsmith, 
toolmaker, machinist, telephone and telegraph lineman, locomotive 
fireman, meat cutter, and boilermaker had made appreciable gains in 
position relative to other occupations. Occupations which had lost status 
were draftsman, file clerk, electrician, auto mechanic, pipe fitter, auto 
serviceman, chauffeur, and motorcyclist. As Stewart points out, it is 
difficult to know just how to interpret these differences, or the relative 
lack of differences, between the two sets of norms. The sampling of 
occupations during the two wars may have been different: certainly 
selective service did not operate on the same principles, and some 
occupations may have been granted deferments more liberally in one war than 
in the other because of differing industrial needs. This would result in 
inferior members of an occupation being its representatives in the war 
in which their group was considered essential to the civilian war effort. 
In the absence of detailed information on the basis of which corrections 
in the occupational means and deviations can be made, one can use the 
Army occupational intelligence data only as a very rough guide. 

A seemingly sound form in which AGCT occupational norms have 
been presented for this type of use is the table prepared by Stewart and 
reproduced earlier in this chapter, in the discussion of occupational 
intelligence levels (pp. 96-97). In this table will be found broad groupings 
of occupations on the basis of the AGCT scores characterizing their 
members. This arrangement minimizes the likelihood that undue emphasis 
will be placed upon insignificant differences within a level; but at the 
same time it risks overemphasizing the importance of differences between 
top and bottom occupations in adjacent levels. One wonders, for 
example, whether the differences between chemists and lawyers are as 
great as the fact that one falls in Stewart's highest group and the other 
in her next highest group implies. The difference is, actually, one of 
three AGCT score points, or less than one-fifth sigma. Although the 
writer has used such tables, and reproduced one based on Army Alpha 
in an earlier text (7(g;:f,b), it now seems wiser to work from a graph 
such as that provided in the manual. The scaled arrangement permits 
the counselor and client to study broad groupings by drawing lines 
wherever they may wish, and at the same time encourages the realistic 
consideration of overlapping and of relative standing in a variety of 
occupations. Data for a longer list of occupations will be found in the 
Stewart reference (758: Table 1). 
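
The arithmetic behind "less than one-fifth sigma" can be checked directly, taking the intended standard deviation of 20 for Army standard scores. The second function goes beyond the text: it assumes, only for illustration, two normally distributed groups sharing that same standard deviation, which occupational groups (typically less variable than the general population) will not satisfy exactly. All names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

ARMY_SD = 20.0  # intended standard deviation of Army standard scores


def difference_in_sigma(points, sd=ARMY_SD):
    """Express a difference in score points as a fraction of one sigma."""
    return points / sd


def probability_of_superiority(d):
    """Chance that a randomly drawn member of the higher group outscores a
    randomly drawn member of the lower group, ASSUMING two normal groups
    with equal SDs whose means differ by d sigma."""
    return NormalDist().cdf(d / sqrt(2.0))


d = difference_in_sigma(3)  # three score points, as in the text
print(d, d < 0.2)
print(round(probability_of_superiority(d), 2))
```

Under these assumptions a three-point difference leaves the two groups overlapping almost completely, which is the point of the caution about adjacent levels.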

Use of the AGCT in Counseling and Selection. It is clear from the 
relationship between the Army General Classification Test and other 
standard tests of intelligence that this instrument is a measure of 
learning ability. This conclusion is reinforced by the consistently significant 
correlations between AGCT scores and success in administrative, clerical, 
mechanical, electrical, academic and other more specialized types of 
training in the Army, even though the nature of the data did not permit 
generalization concerning its relative importance in each of these types 
of training. 

Although no evidence is available concerning the relationship between 
AGCT scores and occupational success, the data on differences between 
occupational groups have been seen to confirm the opinion that persons 
with higher AGCT scores are likely to make good in higher level 
occupations. The details are in general agreement with the findings of studies 
made with other tests, so that generalizations can probably be made from 
this test as from other standard tests of intelligence. These would be to 
the effect that those with high scores are most likely to master new jobs 
rapidly, to rise to positions of responsibility, and to be satisfied in high 
level occupations. 

The test can be used in high schools, colleges, guidance centers serving 
adolescents and adults, employment offices, and business and industrial 
establishments. It is perhaps unfortunate that the name “Army” has been 
kept on the test booklets (although it should be identified correctly 
among professional users), as this may injure rapport with some subjects. 
Experience will no doubt throw more light on this problem. The 
contents and form are quite appropriate despite some items dealing with 
military objects or situations. The occupational norms make the test 
useful for vocational counseling, and for selection in the absence of local 
norms. The lack of college student norms makes it less useful than the 
A.C.E., Otis, and certain other tests for educational guidance, but this 
defect is to some extent remedied by the availability of means for certain 
special types of college students, and by the substantial correlations 
found with grades in various types of training courses. 

The Thurstone Tests of Primary Mental Abilities (American Council on 
Education, 1938, jy.ji; Science Research Associates, 1947) 

The Tests of Primary Mental Abilities were developed by the 
Thurstones in an attempt to provide practical batteries of tests implementing 
their work in the isolation of primary mental abilities. The “Chicago” 
(long-form, two hours) and “SRA” (short-form, 95 minutes) Tests were 
designed for use primarily at the high school level (8.13); another battery 
has been added for the lower age levels. Only the long experimental and 
“Chicago” forms are discussed here, as there are very few data concerning 
the short forms. 

Description 

The Chicago tests were standardized on children in the higher grades 
and in high school, and are therefore designed to be applicable to 
children aged through 17. Approximately 1000 children were tested at each 
half-year; it was administered routinely to all 8B and 10B pupils after 
1 ()11 —.j2, in Chicago schools. While this means that the norms are not 
truly national, they do represent the school population of one of our 
largest cities and provide useful norms; it would still be desirable to 
have national norms, but even more important is the accumulation of 
local norms by other school systems, colleges, and organizations using the 
tests. The battery consists of 11 tests, selected from the bo tests tried out 
experimentally on 1 1 r, j pupils and subjected to factor analysis, and a 
second experimental battery of 1? 1 tests tried out on .jg" subjects and 
factorially analyzed. Those 11 tests measure six primary mental abilities, 
named Verbal Meaning (V), Space (S), Number (N), Memory (M), Word 
Fluency (W), and Reasoning (R). These are measured by tests such as 
vocabulary and opposites (V), flags and cards (S), addition and 
multiplication (N), and letter grouping (R). Two tests are used to measure each 
of the six abilities except memory, tested by one test; they are arranged 
in booklets which can be administered in two school periods. Each test 
is accurately timed, with a practice exercise preceding it. They can be 
scored by hand or by machine, perforated stencils being provided for 
the former. 

Evaluation 

The success of the Thurstones in constructing a practical battery of 
tests of primary mental abilities is obviously an important question. An 
easily administered and scored, reliable, and valid battery would 
represent a major advance in aptitude testing, as it would make possible the 
measurement of a number of aptitudes which are widely used and which 
are of varying importance in different types of activities. Recognition 
of the importance of this possibility is shown by the fact that, although 
Thurstone’s experimental tests were published in 1938 and the definitive 
battery only in lup, there have appeared, within the years since then, 
almost a score of studies of their reliability and validity. The short forms 
should be subjected to even closer scrutiny. 

Influence on Current Test Construction 

The influence of Thurstone’s factorial analyses of mental abilities has 
not been limited to these attempts to validate his tests: it has manifested 
itself in verbal and quantitative scores of the American Council on 
Education Psychological Examinations which he developed (see pages 
ii.j to 12.]), in the performance and verbal I. Q.’s of the 
Wechsler-Bellevue (see pages i.]2 to i j 6 ) and of the California Test of Mental 
Maturity, in the Arithmetic Reasoning, Verbal Comprehension, and other 
similar tests of special mental abilities used in the Engineering and Physical 
Sciences Aptitude Test (p. g]i), in the Navy’s Basic Classification Test 
Battery (7.J0), in the United States Employment Service’s experimental 
test batteries (22.]), in the Psychological Corporation's Differential 
Aptitude Tests (p. g(>S), and in the Aviation Cadet Classification Tests (21 j) 
of the Army Air Forces. Test batteries such as the last four yield no 
I. Q.’s but, instead, yield part scores which, in a given selection program, 
are weighted according to their differential predictive value, in 
accordance with a concept of constellations of abilities needed in various 
occupations rather than of general ability required in varying amounts 
in different occupations. We have seen that the use of quantitative and 
verbal scores is still somewhat problematical in the case of the A.C.E. 
tests, and even more so in those of the California and Wechsler-Bellevue 
Tests. The Engineering and Physical Sciences Aptitude Test is as yet 
virtually untried in this respect (see pages gjt to ;; 12), and the USES 
tests, not yet released for general use, have been validated only in a 
preliminary way on small groups. The Aviation Psychology Program 
(21 ]) used tests of this type to good effect, as demonstrated by the 
correlations in Table 11, which indicate the differential prognostic value of 
some of the factorial-type tests for pilot and navigator training. Multiple 
correlations for batteries which also include tests of other types were in 
the .do's. 

In view of the widespread influence of Thurstone’s work, and the 
important role which it is playing in shaping the intelligence test 
construction work now being done, it seems essential to discuss in some 
detail the practical work which has so far been done with the Primary 
Mental Abilities Tests. 

Studies of the Tests as Such. Traxler (Hr,6) ascertained that the 
reliabilities of the original Primary Mental Abilities Tests were high 
judging by both the split-half and the retest techniques, but attributed 
this to the importance of speed in all of the tests. Results for O73 
freshmen at the University of Chicago were analyzed by Stalnaker (7.) j), also 
in order to evaluate the adequacy of the standardization of the adult 
tests. He reported that the tests used to measure a given factor had 
intercorrelations of .iro to .79, the mean being ..pj. Goodman (297) 
reported slightly lower coefficients. These seem rather low, but the 
intercorrelations of tests not used to measure the same factor range from -.17 
to . jp, most of them being under .20. More serious than this, perhaps, 
is the fact that the items were found not to be in the order of difficulty, 
and that some items were ineffective. His conclusion was that the tests 
were not yet ready for use with individuals. 

Adkins and Kuder (S) administered the original Primary Mental 
Abilities Tests and the Kuder Preference Record to more than 500 
freshmen at the University of Chicago, and found relatively little overlapping 
between the two sets of measures. What overlapping there was seemed 
reasonable in view of the nature of the tests in question. Shanner (711) 
reported a study made at the secondary school level at about the same 
time. He concluded, from evidence generally similar to Stalnaker’s, that 
the tests were reliable and had sufficiently low intercorrelations to 
indicate independence of the traits measured. Although concluding that 
the tests need more research for refinement and interpretation, he stated 
that they are a valuable addition to the field of aptitude testing. Issue 
was taken with this conclusion by Crawford (178), who presented the 
test intercorrelations by frequencies rather than averaging them and 
concluded that they were not sufficiently independent. He also pointed 
out that the correlations between PMA tests and Co-operative 
Achievement Tests were low, and concluded that the tests do not have 
demonstrated diagnostic value. Fortunately, more satisfactory evidence is now 
available, to take the issue out of controversy and into the realm of fact. 

Applications to Education and Vocations. The experimental edition 
of the PMA Tests was given to 501 University of Chicago freshmen by 
Shanner and Kuder (71 l»), together with a number of other tests, and 
correlated with grades on comprehensive examinations taken to secure 
exemption from freshman courses. Results are presented in Table 12. 

It can be seen from Table 12 that the two especially constructed 
aptitude tests yield the highest validities for the appropriate subjects, and 
validities at least as high as any other test for average grades. It would 
seem probable, in view of the multiple correlations between PMA tests 
and subject grades, that these tests would predict average grades about 
as well as the A.C.E. and the special aptitude tests, were it not for the 
tendency of multiple correlation coefficients to shrink. For special 
subjects these last have the advantage of being based on job analysis and of 
being basically miniature situation tests; the presumably greater 
versatility of the PMA tests makes them more desirable for selection in 
institutions which do not have large test construction staffs and for general 
vocational and educational counseling. When the PMA Tests are 
compared with the A.C.E., it is notable that no single PMA factor is as good 
a predictor as the test of general scholastic aptitude (although the verbal 
and deduction factors do about as well for certain courses), and that the 
multiple correlations between PMA Tests and grades in specific courses, 
while generally higher than those of the A.C.E., are not usually 
sufficiently greater to justify the additional time and effort in test 
administration and scoring. 

In a study by Yum (951), also of University of Chicago freshmen, 
somewhat less promising results were obtained. He computed one relationship 
not reported by Shanner and Kuder, namely a multiple correlation 
between PMA Tests and semester average. More important still, he used 
actual grades for students taking the courses, rather than grades based 
on examinations taken to obtain exemption from the courses. The 
correlation was .42, which is considerably lower than that obtained by 
Shanner and Kuder for any single subject, and lower than their multiple 
correlation would presumably have been had it been computed. 

Ellison and Edgerton (239) used the experimental tests first published 
by Thurstone with 49 liberal arts students at the Ohio State University. 
Only the Verbal and Memory Tests had moderately high correlations 
with point-hour averages (.]} and eji respectively), but the multiple 
correlation for weighted scores was .64. The results for grades in specific 
courses were most promising, for Verbal, Spatial, and Deductive Tests 
gave better predictions of English grades than did the Ohio State 
Psychological Examination (.73, .4 j, and .44 as opposed to .42), the Verbal 
Test predicted Science grades better than the general examination (.68 vs 
. j2), and similar results were reported for foreign language grades and 
for psychology grades. The numbers in each case were, however, between 
2“, and 30, and the results seem almost too good. 
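
Why coefficients from groups of this size may look "almost too good" can be seen from the sampling error of a correlation. The sketch below uses the familiar Fisher z transformation to put an approximate 95 percent interval around a coefficient; it is offered as an illustration rather than as part of the studies reviewed, it assumes bivariate normality, and the function name is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correlation_interval(r, n, confidence=0.95):
    """Approximate confidence interval for a correlation via Fisher's z.

    r: observed correlation; n: number of cases (must exceed 3).
    """
    z = atanh(r)                      # Fisher transformation of r
    se = 1.0 / sqrt(n - 3)            # standard error of z
    crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return tanh(z - crit * se), tanh(z + crit * se)


# Verbal Tests vs. English grades (.73 in the text), with n = 30, the
# upper end of the group sizes reported.
low, high = correlation_interval(0.73, 30)
print(round(low, 2), round(high, 2))
```

With 30 cases the interval around .73 runs from roughly .50 to .86, so a considerably smaller coefficient in a fresh sample would not be surprising.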

Most helpful is a series of studies conducted at the Pennsylvania State 
College under the direction of Robert G. Bernreuter. Ball (40) 
administered the older Thurstone battery to 147 freshmen women and 159 
men in the liberal arts college. The correlations with semester point 
average ranged from .oj for the Spatial Tests to .33 for the Verbal. The 
multiple correlation for Memory, Number, Verbal, Induction, and 
Deduction Tests and semester point average was .46, which is no better 
than what one would expect from a much briefer scholastic aptitude 
test. Some of the tests of specific factors correlated substantially with 
appropriate college marks, the coefficient for Number and Mathematics 
being . j 1, and Verbal and English Composition . }o. The Verbal Tests, 
however, tended to have moderately high correlations (.20 to .40) with all 
courses. Hessemer (3(H)) analyzed PMA Test scores for 147 freshmen 
women, using first semester point average and grades in inorganic 
chemistry as her criterion. The Verbal Tests were again the best predictor, 
with a correlation of .44 with semester point average; Deduction followed 
closely with one of .40. There were no satisfactory correlations, however, 
with chemistry grades, that for the Verbal Tests being .rv}, and the two 
highest being -.23 for the Spatial Tests and .18 for the Deduction Tests, 
the irreconcilability of which relationships suggests their chance nature. 

Bernreuter and Goodman (88) obtained data for 170 freshmen 
engineers. In this instance the correlations between PMA Tests and semester 
point average ranged from .04 (Perceptual) to .38 (Deduction), the 
multiple correlation being .51 for Number, Verbal, Space, Induction and 
Deduction or Reasoning Tests. Again the Verbal Tests yielded significant 
correlations with all courses (except Drawing); the correlation between 
Verbal Test and English Composition grades was .44, and that between 
Number and Mathematics grades was also .44. Unfortunately this study, 
like the others just summarized, provides no validity data for tests of 
general intelligence, which might enable one to decide whether or not 
the extra time required by the PMA battery is justified by higher 
validities. This defect is remedied by another Penn State study by Tredick 
(869) who tested 113 freshmen women students of home economics 
with the PMA battery, the Otis, and several other tests. The results, 
shown in Table 13, are in line with the trend of those so far reported, 
in that the Verbal Tests tend to give moderately high predictions of 
grades in all courses and especially in English (.55), the Number Tests 
have a substantial correlation with chemistry grades (. {(i), and Induction 
and Deduction are also good predictors. Most interesting, perhaps, is 
the fact that the multiple correlation coefficient of .f>i for four PMA 
Tests (NMD) and semester point average is substantially higher than 
that of .53 between the Otis and the same criterion, but the R was 
apparently not corrected for the shrinkage which usually takes place 
with a second group. 
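
The shrinkage of a multiple correlation can be estimated with the familiar adjustment for the number of predictors and cases (a formula of the kind usually attributed to Wherry). The sketch below is illustrative only: the group size of 113 and the four predictors come from the passage above, but the value of R entered is hypothetical, and nothing here is reported by the studies reviewed.

```python
from math import sqrt


def shrunken_multiple_r(r_multiple, n_cases, n_predictors):
    """Estimate the multiple R to be expected in the population, using
    adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - k - 1)."""
    if n_cases - n_predictors - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    adjusted_r2 = 1.0 - (1.0 - r_multiple ** 2) * (n_cases - 1) / (
        n_cases - n_predictors - 1
    )
    # A negative adjusted R^2 means no dependable relationship remains.
    return sqrt(max(adjusted_r2, 0.0))


# HYPOTHETICAL R of .60 with 113 cases and four predictors.
print(round(shrunken_multiple_r(0.60, 113, 4), 2))
```

The loss is modest with over a hundred cases and four tests, but it grows quickly as the number of cases falls or the number of predictors rises, and actual cross-validation on a second group may shrink the coefficient further.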

It is interesting to note, in passing, the correlations between PMA 
Tests and the tests of general and special aptitudes used by Tredick. All 
of the former have moderate or high correlations with the Otis (.29 to 
.68), only the coefficients for the Number and Memory Tests being below 
.40. The perceptual factor is important in the Otis (r = .53) (presumably 
because of the emphasis on speed), the Minnesota Vocational Test for 
Clerical Workers (.57, .51) and the Minnesota Spatial Relations Test 
(.55), but much less so in the Minnesota Paper Form Board (.39). The 
number factor is highly correlated not only with the Minnesota Clerical 
Numbers (.59), but also with the Names (.58) Test. The verbal factor is 
very important in the Otis (.68), and of moderate importance in the 
Clerical Names (.40) and Art Judgment Tests (.39). The spatial factor 
plays a moderately important part in all of the tests in the study except, 
interestingly, in the Art Judgment Test, where its role is of only slight 
importance (.20): it is most closely correlated with the Minnesota Spatial 
tests (.37 as contrasted, e.g., with .41 and .36 for Clerical Names and 
Numbers). This suggests that the so-called spatial factor measured by the PMA 
Tests may be more general than strictly spatial. The memory factor is 
moderately correlated only with the Otis (.29); other coefficients are about 
.20 or below. Induction plays important parts in the Otis (.60) and in the 
spatial tests (.48 and .47), is moderately important in Clerical Names 
(.44), and of some importance in the other tests used. The deduction or 
reasoning factor plays a similar role, but is somewhat less important in 
the Minnesota Spatial Relations Test than in the Paper Form Board 
(r’s of .33 and .45 respectively). 

Goodman (297) reviewed the work done by other investigators at Penn
State, and reported further research of his own with engineering
freshmen. The correlations between PMA Tests and first year semester point
averages ranged from .08 (P) to .3 j (V) and a;f> (D). The Number factor,
which one might logically expect to yield one of the highest r’s with
engineering grades, had a correlation of only .36 and the spatial factor
only .18. This tends to support the conclusion drawn earlier from work
with the A.C.E. part scores, to the effect that verbal and “general”
intelligence tests are at least as effective predictors of success in
technical courses as are more quantitative tests. Goodman also obtained
the intercorrelations between specific tests in the PMA battery, and
between these tests (as contrasted with combinations of tests which
measured specific factors) and the criterion. This analysis showed that
the intercorrelations of tests measuring the same factors ranged from .01
to .72, with a median of .33, which suggests that the measurement of
specific or primary factors still leaves much to be desired; it also
revealed that some of the specific tests had higher correlations with the
criterion than did the factor scores to which they contributed. This last
finding is not surprising, as a test of mixed factors might predict
success in a task involving some of those same factors better than a score
representing more adequately one “pure” factor which is only one
contributor to success in the activity in question.

A few other studies which have been reported show results similar 
to those just reviewed. Stuit and associates (787) administered the PMA
Tests to students in engineering, medical, and journalism schools, and
reported characteristic profiles. Engineers were high on S and D, low
on V and M; journalists were high on P, N, and V, low on M and D;
medical students were high on P and I. This suggests that the battery 
should be useful in guiding students into curricula in which their 
abilities resemble in type those of the majority of students. More work 
should be done along these lines, as the differential use of the tests 
should be one of their principal contributions. However, this type of 
standardization has merely been begun. 

Perhaps the nearest thing to validation in terms of vocational criteria
has been carried out by Harrell and Faubion (338), who administered
the experimental PMA Tests to 105 men in aviation maintenance
courses in an Air Forces Technical School. The multiple correlation of
Verbal, Spatial, Induction, and Deduction Tests with average grades
was T3, which contrasted with a correlation of .45 between Army Alpha
scores and grades. The Number Tests correlated most highly with grades
in shop mathematics (.37, contrasted with .31 for Army Alpha and .46
for the combined PMA Tests); the Verbal Tests correlated most highly
with grades in Electricity (.51, compared to ..\7 for Army Alpha and .57
for the combined PMA Tests); and the Deduction or Reasoning Tests
predicted grades in blue-print reading and mechanical drawing most
effectively (.5 j, compared with .30 for Army Alpha, .3b for the Spatial
Tests, and .ho for the combined PMA battery).

Use of the PMA Tests in Vocational Counseling and Selection. The
studies reviewed in the preceding pages make it clear that the long forms
of the Thurstone Tests of Primary Mental Abilities, while sufficiently
perfected to make possible important research into the nature and
organization of human abilities, still need to be improved before they
become a practical instrument for use in guidance and selection. The
defects in the tests have been summarized by Crawford and Burnham
(1 Ko:e 13). The measures of specific factors are still somewhat impure, as
shown by the moderate rather than high intercorrelations of tests used to
measure a given factor. Speed plays too important a part in all the tests.
The relationships between specific factors and other tests or criteria with
which they might be expected to be related are often low enough to
make one question the adequacy of the measurement of the factor (e.g.,
the spatial factor). On the other hand, there are a number of findings
which are extremely encouraging. Among these are the generally higher
multiple correlations between PMA Tests and criteria than among
general intelligence tests and criteria, which suggest that in selection
work especially it will be advisable to use this more refined type of
measure, to obtain differential occupational weights, and to score
accordingly. In time, accumulated data may make these differential
weights useful in counseling: that is, the score for an improved spatial
factor might be multiplied by 5, that for the number factor by 4, that
for the verbal factor by l\, etc., in order to compare the promise of a
counselee with that of others who have entered technical occupations,
whereas the same scores would be multiplied by weights of 1, 5, and {
in order to compare his promise with that of men in accounting
occupations. This technique was applied to potential pilots, navigators,
and bombardiers in the Army Air Forces with considerable success, is being
experimented with by the United States Employment Service, and may
become possible with the PMA Tests as they are improved and occupational
norms are accumulated. In the meantime, it should be remembered
that these tests are still a promising device for research rather than a
practical tool for counselors or personnel managers, and that the short
forms are as yet untested.
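
The arithmetic of differential weighting may be sketched as follows. The
weights for the spatial and number factors follow the illustration given
above; the verbal weights are not legible in the source, so the values
used here are placeholders, and the counselee's scores are hypothetical
standard scores. The names WEIGHTS and weighted_composite are likewise
introduced only for this sketch.

```python
# Spatial and number weights follow the illustration in the text.
# The verbal weights (2 and 3) are placeholders, not values from the source.
WEIGHTS = {
    "technical":  {"spatial": 5, "number": 4, "verbal": 2},
    "accounting": {"spatial": 1, "number": 5, "verbal": 3},
}

def weighted_composite(scores, weights):
    """Sum of factor scores, each multiplied by its occupational weight."""
    return sum(weights[factor] * scores[factor] for factor in weights)

# Hypothetical standard scores (mean 0, SD 1) for one counselee.
counselee = {"spatial": 1.2, "number": -0.3, "verbal": 0.5}

composites = {
    field: weighted_composite(counselee, w) for field, w in WEIGHTS.items()
}
for field, value in composites.items():
    print(field, round(value, 2))
```

A composite of this kind acquires meaning only when it is compared with
the distribution of composites earned by persons already in the
occupation, which is why occupational norms must first be accumulated.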

The Wechsler-Bellevue Scales of Mental Ability (Revised manual: Williams
and Wilkins, 19.13. Test materials: Psychological Corporation)

The publication, in 1939, of the Wechsler-Bellevue Scale of Mental
Ability as an individual intelligence test designed for use with adults
rather than with children immediately focussed the attention of the more
clinically minded psychologists and counselors on this instrument, even
when the nature of the counseling problem was largely vocational and
educational. The aura surrounding individual testing, as opposed to the
supposedly less sensitive measurements obtained from group tests, alone
was a sufficient cause of such interest in the Wechsler-Bellevue. To this
appeal was added, however, that of a test which yields two types of
scores, one based on verbal and one on performance items. The fact
that the scale was developed in a mental hospital, primarily for the
diagnosis of mental defects and mental impairment in adolescents and
adults, and that all the original material on the test was directed toward
these uses (91.]), resulted only in greater confidence on the part of the
clinically minded who proceeded to use the scale in vocational and
educational guidance. Because of their widespread use in guidance
centers some aspects of the Wechsler-Bellevue Scales are considered
here, more as a caution to users than as a guide to use in vocational
counseling.

The question of the clinical usefulness of the scales is clearly quite
independent of the question of its usefulness in vocational counseling
and selection. When considering the use of such a test in vocational
guidance and personnel work, three questions are relevant. First, what
advantages, if any, does an individually administered test of mental
ability have over a group-administered test in vocational guidance or
selection? Secondly, how good is the instrument as a test of general
mental ability? Thirdly, what evidence is there concerning the
occupational significance of total and part scores, particularly the
latter? Each of these questions will be dealt with briefly in the
following paragraphs.

Individual vs. Group Tests. The relative advantages and disadvantages
of group and individual, performance and paper-and-pencil tests
have been discussed in Chapter .]. But for the sake of convenience a
few especially pertinent points should be made here. Tests designed for
group administration can also be administered individually, and therefore
have the advantage of being more flexible in their use. On the other
hand, they are generally paper-and-pencil tests, which do not have the
flexibility that orally administered individual scales such as the
Wechsler-Bellevue possess. In the former, the examinee reads and answers
questions by himself without the examiner being able to judge his
reactions by anything more than expression and gestures, and with no
possibility of modifying the questions to suit the background of the
subject. In the latter, the administrative procedure is more
conversational, and therefore the examiner has much more opportunity to
judge the reactions of the subject and to modify procedures in such a way
as to be completely fair. In clinical work the desirability of the latter
type of technique is obvious, for then one is working with cases whose
background or condition is unusual in some respects and it is important
that the test situation permit the examiner to observe these abnormalities
and to modify the test procedure accordingly in some instances and to note
them for diagnostic use in others. But in vocational and educational
counseling or selection the examiner is dealing with persons whose
condition is approximately normal and whose background is such as to make
standardized techniques appropriate. For each normal counselee or
employment applicant there is a suitable group test of mental ability,
developed for use with and standardized on subjects such as he:
modification of test procedures is therefore generally unnecessary if the
examiner has background data on his subjects and chooses his tests well.
Furthermore, the normality of the examinee means that the purpose of the
test is to get an overall measure of mental ability, not to study
peculiarities of mental functioning. For this reason also the group test,
which provides a suitable series of standardized tasks and obtains a
measure of performance on those tasks, yields all of the types of data
which can legitimately be expected from intelligence testing for
vocational or educational purposes.

The Wechsler-Bellevue as an Intelligence Test. Studies of the
Wechsler-Bellevue published prior to 1945 have been summarized by
Rabin (618) and by Watson (912). The trends revealed in these summaries
are for the Wechsler-Bellevue scores and Revised Stanford-Binet
to correlate from .78 to .93 when the groups are heterogeneous in age
or mental ability, and about .62 when they are more homogeneous
(e.g. college freshmen). The verbal scale is uniformly more highly
correlated with the Revised Stanford-Binet than is the performance scale.
The correlations with group tests are, as has generally been the case
with individual tests, lower than those with other individual tests: for
Army Alpha a coefficient of .74, for the Otis SA .425, and for the A.C.E.
.48 and .53 are reported. Wechsler-Bellevue I. Q.’s of superior
individuals were found to be lower than those obtained on the Revised
Stanford-Binet, while persons of little mental ability made higher scores
on the Wechsler than on the Binet Scales, because the Wechsler has a
smaller standard deviation. Rabin and Watson also deal with the clinical
significance of part scores, but that topic is not relevant to our
purposes. From the trends reported above one can conclude that the results
of the Wechsler-Bellevue Scales agree with the results of other
intelligence tests as well as can be expected.
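
The effect of a smaller standard deviation on the I. Q.'s of superior and
dull persons can be made concrete. The sketch below assumes that both
scales can be treated as deviation scales with a mean of 100, which is a
simplification, and the standard deviations of 15 and 16 are illustrative
assumptions rather than figures taken from the studies cited.

```python
def iq_from_z(z, sd, mean=100.0):
    """Convert a standard score z to an I.Q. on a scale with the given SD."""
    return mean + sd * z

# Assumed values: a narrower scale (SD 15) and a wider scale (SD 16).
NARROW_SD, WIDE_SD = 15.0, 16.0

for z in (-2.0, 0.0, 2.0):
    narrow = iq_from_z(z, NARROW_SD)
    wide = iq_from_z(z, WIDE_SD)
    print(f"z={z:+.1f}  narrow={narrow:.0f}  wide={wide:.0f}")
```

The same relative standing thus yields a lower I. Q. on the narrower scale
for the superior person and a higher one for the dull person, while the
average person is unaffected.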

Occupational Significance of Total and Part Scores. From the point
of view of the vocational psychologist, counselor, and personnel manager,
the crucial question concerning this or any other intelligence test is:
what evidence is there to help me interpret the test scores in terms of
prospects of success in various types of work? The answer, for the
Wechsler-Bellevue Scales, is: practically none. Neither Rabin nor Watson
located any studies of the occupational significance of Wechsler-Bellevue
scores, and the writer has located only one published prior to lqjy, in
which Altus and Mahler (15) reported significant differences in the
Wechsler Mental Ability Scale (Form B) verbal scores of 2 jyti Army
illiterates who had been employed in skilled or semiskilled occupations,
on the one hand, or in unskilled occupations on the other. One can, of
course, use the total or verbal scores in a general way, by analogy. A
person who has very superior intelligence on these scales would also have
very superior intelligence on Army Alpha, and such people, we know
from research with the latter test, tend to succeed in the higher
professional and managerial occupations; similarly for dull, normal,
average, and other levels. But the possibility of such interpretations
does not constitute a special advantage of the Wechsler-Bellevue in
vocational and educational guidance. It is, rather, a means of salvaging
and making useful the results of a test which would otherwise be useless
in vocational guidance and selection. There are other tests of mental
ability whose vocational significance is based on more direct evidence:
they are therefore subject to less error in interpretation.

For the part scores, or verbal and performance I. Q.’s, the answer
concerning the vocational and educational significance of the scales is
even less equivocal. Neither Rabin nor Watson mention the occupational
significance of verbal and performance I. Q.’s, although Rabin cites one
study (16) of the relationship between total and part scores and
achievement in college. In this investigation Anderson and his associates
reported a correlation of .41 between the full scale and the first
semester grades of 112 college women, while the Verbal and Performance
I. Q.’s yielded correlations of .50 and .19 respectively. These compared
with correlations for 1941 A.C.E. total, linguistic, and quantitative
scores of .54, .5], and .59 (the data for the 1940 form of the A.C.E. were
.48, .48, and e{f»). Obviously, the Wechsler-Bellevue Performance I. Q. is
of no value in predicting success in the first semester of a liberal arts
college, and the performance items lower the validity of the verbal items
in the total score. The Verbal I. Q. itself is no more adequate a
predictor of success in the liberal arts than is a group test of
intelligence such as the A.C.E.

With such a paucity of evidence the use of the Verbal and Performance
I. Q.’s in the differential diagnosis of vocational and educational
aptitudes is clearly unwarranted. To reason by analogy and interpret
Wechsler-Bellevue scores as though they were synonymous with linguistic
and quantitative scores on the American Council Psychological Examination
or with primary mental abilities scores on the Thurstone tests is
also unwarranted, although this seems to have become a rather widespread
practice among psychometrists and counselors. It is true that
Balinsky’s factor analysis (49) isolated verbal and performance factors,
the former consisting, at age 25-29, of digit-symbol, comprehension, and
information items, and the latter of “spatial” items such as picture
completion, object assembly, and block design. But Anderson and others
(16) have shown that, although there is a moderate correlation between
Wechsler Verbal and A.C.E. Linguistic scores (r = .49 or .50), the
relationship between Performance and Quantitative scores is too low
(r = .31 or .50) for interpretation of one in terms of the other to be
justifiable. No such data are as yet available for the Wechsler and PMA
Tests. And we have seen that differential educational diagnosis on the
basis of either A.C.E. or PMA Test part-scores is still in the
experimental stages.

Use of the Wechsler-Bellevue Scales in Counseling and Selection. As
the Wechsler-Bellevue Scales are used in more and more studies evidence
upon which to base judgment concerning the vocational and educational
significance of part scores will presumably be forthcoming. In the
meantime the objective psychologist, counselor, or personnel officer can
only recognize that the use of anything more than the total or verbal
score as a rough index of the educational and occupational level which the
person in question may attain is unwarranted, and that, for most persons,
this can be done at least as well and more economically by means of
paper-and-pencil tests.

Promise and Proficiency

IN COUNSELING young people concerning the choice of careers one
is generally concerned with promise, that is, with prospects of success in
a field in which the youth has as yet had no substantial training or
experience. In selecting employees, on the other hand, the concern is more
likely to be with proficiency, that is, with present ability to perform
the tasks involved in a given job. Proficiency, achievement, or trade
tests are therefore generally thought of as instruments for the selection
of personnel or for the evaluation of the outcome of training, whether in
school or on the job. However, past achievement is often one of the best
indices of future accomplishment, so that achievement tests can frequently
be used as tests of aptitude for related types of activity.

The difference between an aptitude and an achievement test therefore
lies more in its use than in its content. An achievement or proficiency
test is used to ascertain what and how much has been learned or how
well a task can be performed: the focus is on evaluation of the past
without reference to the future, except for the implicit assumption that
acquired skills and knowledge will be useful in their own right in the
future. A test of achievement in arithmetic is therefore a measure of
mastery of the essential processes of arithmetic and of ability to make
certain types of computations. A measure of proficiency in typing is an
index of ability to copy typewritten material with speed and accuracy and
therefore of ability to perform certain types of clerical duties to an
employer’s satisfaction. An aptitude test is used to judge the speed and
ease with which skills and knowledge, that is, proficiency, will be
acquired. But, obviously, proficiency in a given task may be an index of
promise in a related task, and knowledge of certain types of facts may be
indicative of facility for the learning of other types of facts.

Therefore a test of arithmetic achievement may be a good index of
aptitude for algebra or for engineering, a test of typing proficiency may
be a good measure of aptitude for stenography, and a test of information
concerning recent developments in science may be a good predictor of
success in medical training. Each such relationship is of course strictly
hypothetical until experimentally checked and found to be true, for even
a good achievement test cannot be assumed to be a good aptitude test
until it has been validated in the same manner as any other aptitude test.
Achievement in arithmetic may prove to predict success in algebra, but
have no relationship to engineering grades: one cannot take the
relationship for granted, since what may seem like perfectly legitimate
assumptions in the field of prediction often prove unwarranted. An
achievement test (or test of any type) can be used as an aptitude test
only when there is a known relationship between the performance tested and
the performance in which success is to be predicted. This is the essence
of aptitude testing, the understanding of which takes all the mystery out
of the subject. As it becomes more generally realized that aptitude
testing is nothing more than the prediction of success in one performance
by means of a measure of success in another performance known to be
related to it, people in need of guidance will have more reasonable
expectations from tests, business and industrial men will be more inclined
to see their possibilities and limitations, and professional users of
tests will have more freedom to make legitimate use of them.
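
Prediction of one performance from another can be illustrated with the
simplest case, that of a single test and a single criterion, both
expressed in standard scores. This is only a sketch under the usual
assumption of a linear relationship; the validity coefficient of .50 is
hypothetical, and the function names are introduced for illustration.

```python
import math

def predict_z(r, z_predictor):
    """Predicted criterion standard score from a predictor standard score."""
    return r * z_predictor

def standard_error_of_estimate(r):
    """Spread (in SD units) of actual criterion scores around the prediction."""
    return math.sqrt(1.0 - r * r)

r = 0.50   # hypothetical validity coefficient
z = 1.0    # examinee stands one SD above the mean on the test

print(predict_z(r, z), round(standard_error_of_estimate(r), 3))
```

Even with a coefficient of this size the predicted standing is only half
as far above average as the tested standing, and the margin of error
around it remains wide, which is one reason for moderate expectations
from any single test.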

Educational Achievement Tests

Educational achievement tests are of interest to us here only as indices
of promise in vocational activities. Treatises of their use in evaluating
the results of instruction, as measures of educational progress, and
related topics are numerous (310.474,!»/)(>): unfortunately there has been
less study of their use in predicting educational success and still less
of their value in predicting vocational success.

In the prediction of educational success, educational achievement tests 
have been effectively used in the admission programs of colleges and
professional schools. In most investigations they have been tried in com¬ 
bination with tests of scholastic aptitude and with high school averages, 
in order to determine the relative value of each type of predictor. In one 
such study at the University of Minnesota (930) it was found that high 
school rank was the best single predictor of sophomore achievement, but 
that a combination of the three types of indices was better than any 
single index. They may be similarly used by counselors in guiding
students concerning the choice of college or professional school, the
counselee’s standing on a test in comparison with typical candidates for
admission being used as an index of the possible wisdom of the choice.
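
Why a combination of indices may surpass the best single index can be
shown with the standard formula for the multiple correlation of a
criterion with two predictors. The sketch below is limited to two
predictors for simplicity, and all three coefficients are hypothetical;
they are not the Minnesota figures.

```python
import math

def multiple_r_two(r1, r2, r12):
    """Multiple R of a criterion on two predictors.

    r1, r2: each predictor's correlation with the criterion
    r12:    correlation between the two predictors
    """
    if abs(r12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 * r12)
    return math.sqrt(r_squared)

# Hypothetical values: high school rank (.55), an achievement test (.50),
# and an intercorrelation of .60 between the two predictors.
print(round(multiple_r_two(0.55, 0.50, 0.60), 3))
```

The gain over the better single predictor is modest here because the two
predictors overlap considerably; the less they correlate with each other,
the more the second adds.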

It would be desirable also to have data which would make it possible to
counsel concerning the wisdom of the choice of a major field of study
such as premedical, engineering, business, and related courses, but
unfortunately the data which are needed for such applications of
achievement test results are available for only a few institutions.
Although the assumption that a weakness in science after high school
should be taken as a negative indication for a college major in science
has some justification, it should not be concluded that the lack of a
given high school subject will mean a weakness in a related college
subject, for too many studies of the importance of high school
prerequisites in college admissions (i io) have demonstrated that one may
do superior college work regardless of background in specific high school
subjects. On the other hand, it seems likely that the quality of work done
in a high school subject, whether measured by the grade obtained or by
score on an achievement test, will, other things being equal, be
indicative of the quality of work that will be done in a college course in
the same subject.

The question is, what is known concerning the actual predictive value
of achievement tests? This question will be examined in connection with
the specific tests discussed in the paragraphs which follow. In brief,
experience has shown that achievement tests not only yield predictions of
college averages which are about as good as those provided by intelligence
tests, but also give better differential predictions of success in
specific subjects than do intelligence tests (117, 701).

The Iowa Placement Examinations (University of Iowa, 1924, 1930, 1941)

These were among the first educational achievement tests which were
constructed, under Stoddard’s supervision, to cover the major
subject-matter fields for purposes of differential prediction in college.
First published in the mid-twenties, they have been widely used and are
among the best constructed and most thoroughly understood tests of their
type; hence their treatment here.

Applicability, Content, Administration, Scoring, and Norms. There are
two series, one designed to measure achievement and assuming a year of
high school work in the subject, the other designed as a measure of
aptitude for the same subjects. Both are designed for placement in college
classes. Fields covered are Chemistry, Physics, Mathematics, English,
French, and Spanish. The training or achievement series has been the
more widely used, and has been generally the more valid. The tests
require forty minutes each for the administration, and scoring is by means
of a convenient stencil. Normative groups are large, consisting of more
than 10,000 students in nine colleges.

Standardization and Initial Validation. In constructing subject-matter
tests the attempt is generally made to obtain what Mosier (548) calls
validity by definition by having experts outline the content of the field
to be covered by the test and construct items which they consider to be
representative of that content; these outlines and items are then checked
by other experts in the same field, in order to insure representative
judgment. Textbooks and courses of study were analyzed in the development
of outlines. The tests were then correlated with first semester marks
in the nine co-operating colleges, the correlations ranging from .26 to
.95, the mean for the Training Series being .60 and that for the Aptitude
Series .50 (759). Stoddard reported that the Iowa Placement Examinations
gave better predictions of grades in specific subjects than did either
high school grades or intelligence test scores, and he and Hammond (326)
found that the combined achievement scores had more predictive value
than an intelligence test, although he also found that a single
intelligence test gives a better prediction of average college marks than
does a single achievement test.

Reliability. As might be expected in the case of carefully constructed
achievement tests, the reliabilities are high: they ranged from .87 to .92
(759).

Validity. Validation of the tests subsequent to their development was
pursued most intensively at the University of Iowa in the late ’20’s and
at the University of Minnesota ten years later. Hammond and Stoddard
(326) used the tests in a number of engineering colleges, with results
comparable to those first obtained by Stoddard. The extremely high and low
scores were found to be especially useful in singling out students who
were most likely to succeed and fail, respectively. For example, of the
100 highest and 100 lowest scoring students on the mathematics achievement
test, only seven of the former and as many as 61 of the latter failed the
first semester course in mathematics. Working with engineering freshmen
at Minnesota, Northby (569) found correlations of .55 and .70 between
the same test and honor-point ratio for two different classes. In all the
groups studied by Hammond and Stoddard the proportion of failing
students in the top quarter of the placement examinations was less than
10 percent, while from 28 to 58 percent of those in the lowest quarter
failed the first semester’s work. Segel summarized research with these
tests in 1934, finding a median correlation of .40 between the English
Training Test and college English marks and one of .54 between
the Chemistry Training Test and college marks in Chemistry (701).
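
The figures for the 100 highest and 100 lowest scoring students lend
themselves to a simple expectancy comparison. The counts below are those
reported above; the function name and the ratio computed at the end are
merely one convenient way of expressing them.

```python
def failure_rate(failed, group_size):
    """Proportion of a group that failed the course."""
    return failed / group_size

high = failure_rate(7, 100)    # 100 highest-scoring students
low = failure_rate(61, 100)    # 100 lowest-scoring students

print(f"high scorers: {high:.0%} failed")
print(f"low scorers:  {low:.0%} failed")
print(f"low scorers failed about {low / high:.1f} times as often")
```

Expressed in this way the extreme scores are seen to separate the groups
sharply, although nothing is implied about students scoring in the middle
range.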

Use of the Iowa Tests in Counseling and Selection. The data summarized
above make it clear that the Iowa Placement Tests, especially
the achievement tests, can be used as indices of differential achievement
in college. Students or prospective students who show strength in a given
subject-matter test are more likely than not to make good grades in that
field, whereas those who make low scores despite appropriate preparation
are not likely to make good grades in courses in that subject. Those whose
average on the battery of tests is high are likely to make high grades in
their college work taken as a whole. The tests can therefore be used in
counseling students concerning choice of college and concerning choice
of major field; they can also be used in selecting qualified students for
admission to college and to departments or professional schools. It should
be remembered, however, that the tests are not replaced annually by new
forms as in the case of the Co-operative Test Service tests, to be
described below. For this reason the Iowa tests are valuable for the
knowledge they provide concerning the predictive value of such tests, but
are now of less practical use than some of those developed by active test
construction organizations.

The Co-operative Achievement Tests (Co-operative Test Service,
periodically)

The Co-operative Test Service began the publication of annual editions
of achievement tests in the major school subjects early in the 1930’s,
sponsored by the American Council on Education and operated under
the leadership of Ben D. Wood and John C. Flanagan during the first
decade of its existence. It is now part of the Educational Testing Service.

Applicability, Content, Administration, Scoring, and Norms. Each test
is designed for use at a specified educational level, which may include a
range of as many as three or four years. The content is kept up to date by
the periodic publication of new editions, but earlier editions are also
available and are generally usable for several years (an important point,
as examination of the content of some well-known social studies tests
of pre-war vintage will reveal). Norms are provided for large groups of
students, and are made national and kept up to date by the large-scale
testing programs in which the annual editions are used. The content
varies with the field covered by the test and with the level for which it
is designed, the method of construction (discussed below) providing for
adequate coverage.

Of special interest in vocational and educational guidance and selection 
are the Co-operative Survey Tests (Natural Sciences, Social Studies, and 
Mathematics), the Co-operative Test of Recent Social and Scientific Devel¬ 
opments, designed for use with high school juniors and seniors, the Co¬ 
operative General Culture Test (History and Social Studies, Literature, 
and Fine Arts), and the Co-operative Contemporary Affairs Test, designed 
for use with college sophomores but applicable to other persons of 
college caliber and age. These tests have the advantage of providing not 
only comparisons of the achievement of the person being studied with 
that of other persons with similar backgrounds, but also a picture of the
relative strengths and weaknesses of the counselee in the subjects tested.
The Survey Tests, being based on the content of high school courses, are
useful in counseling high school seniors and entering college freshmen
concerning the choice of college majors; the other tests, less closely
anchored to specific courses, are useful in helping non-science majors
understand their special strengths and can be used as something of an
interest test, for they reflect to a considerable extent the subjects in
which
the student has been interested and upon which he has kept informed. 

Administration is simple, and scoring can be done either by hand or 
by IBM test-scoring machines; in either case, the use of special answer 
sheets makes for economy of materials and ease of scoring. 

Standardization and Initial Validation. As in the case of the Iowa 
tests, the Co-operative Achievement Tests are developed by subject-matter 
experts who work with test technicians; test outlines are based on analyses 
of courses of study and textbooks, and items are checked by both types of 
experts. The first type of validity achieved is, therefore, validity by 
definition. Further validation is occasionally carried out by correlating test 
scores with high school or college grades; these correlations have 
generally been moderately high (.30 to .50) for appropriate subjects (lj.j). 

Reliability. The reliability coefficients vary slightly with the test and 
form, but have generally been .90 or higher, as one would expect in the 
case of subject-matter tests constructed by experienced technicians. 

Validity. It has already been stated that the validities of the 
Co-operative Achievement Tests for the prediction of grades in related 
subjects range from about .30 to about .50. When scores made on a battery 
of achievement tests such as the Co-operative General Culture Test are 
combined, higher correlations are reported; in one study (24) the validity 
coefficient for the latter test and average grades for the first two years in 
college was .53. 

More striking still are the mean General Culture scores made by 
students in different major fields, which show that students of journalism, 
religion, and law made above average total scores, probably reflecting 
their broader interests, whereas engineers have apparently a much more 
restricted range of interests and make significantly low general culture 
scores. More important than pre-occupational differences in total general 
information are, of course, the differences in patterns of scores on the 
various subtests. Analyses (679) of these show that students who later 
became medical students had made generally high scores as freshmen. 
Dentistry students had excelled in mathematics and science but not in 
other areas. Journalism students reversed this pattern. Library Science 
students were high in English but mediocre in other fields. Business 
students were characteristically high in mathematics but low in English. 

It would be desirable to have data showing the relationship between 
patterns of achievement on tests such as the Survey and General Culture 
Tests, and choice, achievement, and satisfaction in different types of 
work. One would expect, for example, that social workers would be 
persons who, in college, made their highest scores on tests of the social 
studies, and that successful engineers are those who, on entering college, 
showed special strength on tests of achievement in natural sciences. But 
no data such as these have come to the writer’s attention. 

Use of the Co-operative Tests in Counseling and Selection. In view 
of the moderately high relationship between scores on these subject- 
matter achievement tests and grades in appropriate courses, they may 
well be used in helping students evaluate their prospects of success in 
various major fields in high school and college, in placing students in 
sections for which their background qualifies them, and in selecting 
students for courses of training which emphasize mastery, at a higher level, 
of the same type of subject matter as that covered by the test. There are as 
yet no direct, objective data to justify counseling concerning the choice of 
an occupation on the basis of educational achievement test scores, but 
insofar as achievement on a test is related to grades in a professional or 
vocational school, and grades in such a school are related to entry into 
or success in the occupation for which it prepares, it should be safe to 
deduce that some educational achievement tests do have at least indirect 
predictive value for some occupations. 

The Tests of General Educational Development (Science Research 
Associates, 1946) 

The Tests of General Educational Development were constructed for 
the United States Armed Forces Institute under the direction of E. F. 
Lindquist; another series under the same title was developed by 
Lindquist at the University of Iowa. Both series are obtainable from Science 
Research Associates. As the USAFI series has the most comprehensive 
norms and has been most widely used with returned servicemen, it will 
be discussed here. 

Applicability, Content, Administration, Scoring, and Norms. The 
GED Tests were designed for use at high school and college levels, a 
separate battery being designed for each level. The high school battery 
covers five areas: correctness and effectiveness of expression, 
interpretation of reading materials in the social studies, natural sciences, 
and literature, and general mathematical ability. At the college level there 
are four tests, mathematical ability not being covered. The objective is 
to measure understanding rather than factual knowledge. The tests are 
power tests, with two hours allowed for each test. IBM answer sheets 
make possible stencil or machine scoring. Norms are available for six 
geographical regions (an advance over national norms) and for the 
country as a whole, and for college students in three types of institutions 
classified according to freshman mental level. 

Standardization and Initial Validation. The procedures used in 
developing the GED Tests were similar to those used in the construction 
of other achievement examinations, with the exception that an attempt 
was made to measure understanding rather than factual knowledge in 
view of the lapse of time since many servicemen had attended school. 
This trend is a wholesome one in achievement testing, in view of the 
common tendency to overemphasize factual knowledge; it should not 
result, however, in failure to measure the mastery of factual knowledge 
which constitutes the basic tool of many subjects. 

Reliability. In view of the attempt the test authors made to measure 
understanding rather than knowledge of facts, it is perhaps important to 
note that the reliabilities of the tests are not reported. 

Validity. The validity of the GED Tests has been studied primarily 
in relation to the prediction of success in college. Crawford and Burnham 
(179) administered the tests to 135 freshmen (veterans and non-veterans) 
at Yale University, finding a correlation of .72 between total scores on the 
GED Tests and the College Entrance Examination Board Tests, and 
correlations of .56 and .53, respectively, for each of these tests with first- 
term freshman grades. Correlations between part scores on the GED 
Tests and freshman marks ranged from .36 to .51, the former being for the 
natural sciences and the latter for the expression test. Dyer (227) tested a 
group of ]j,j Harvard students, about one-half of whom were freshmen 
and the balance in the three other classes. He also found that the total 
score provided a reasonably good prediction (r = .,jf>) of college grades. 
In a third study conducted at the University of Minnesota, Callis and 
Wrenn (131) obtained a correlation of .72 between total GED score and 
honor-point ratio. The significantly higher figure may be a chance error 
related to the small number of cases (N = 5b), or may perhaps be due to 
a greater range of ability resulting from less stringent initial selection in 
a state university. The authors’ comparison of the two suggests the latter. 
Like the other studies, this one suggested that the Expression and Social 
Studies Tests are the best single tests in the battery for predicting overall 
success in college. 
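
The suggestion that a wider range of ability raises the validity coefficient can be illustrated with the standard correction for restriction of range on the predictor (Thorndike's Case 2). The sketch below is only an illustration under stated assumptions: the ratios of standard deviations are hypothetical, and none of the studies cited above reports them; only the coefficient of .56 is taken from the text.

```python
import math


def correct_for_range_restriction(r_restricted, sd_ratio):
    """Estimate the correlation in a less selected group (Thorndike Case 2).

    r_restricted: correlation observed in the selected (restricted) group.
    sd_ratio: predictor SD in the unselected group divided by the
              predictor SD in the selected group (hypothetical here).
    """
    r = r_restricted
    u = sd_ratio
    return (r * u) / math.sqrt(1.0 - r * r + r * r * u * u)


if __name__ == "__main__":
    # .56 is the coefficient reported for the highly selected group;
    # the SD ratios are assumptions chosen only for illustration.
    for assumed_ratio in (1.0, 1.25, 1.5):
        estimate = correct_for_range_restriction(0.56, assumed_ratio)
        print(f"SD ratio {assumed_ratio:.2f}: estimated r = {estimate:.2f}")
```

With a ratio of 1.0 the coefficient is unchanged; as the assumed spread of ability in the less selected group grows, the estimated coefficient rises, which is the direction of the difference discussed above.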

The value of the GED Tests in placing students in advanced courses, 
one of the principal uses to which the tests were intended to be put, has 
been ascertained only by Dyer (227). He found that, with curricula and 
promotions such as those at Harvard, the tests were of no value in the 
advanced admission of students in either scientific or non-scientific 
curricula. Dyer also reported that patterns of GED scores tended to agree 
with patterns of interest as shown by field of concentration. 

Use of the USAFI GED Tests in Counseling and Selection. The studies 
so far published indicate that the GED Tests are scholastic aptitude 
rather than scholastic achievement tests. In view of the authors’ attempt 
to measure understanding rather than factual knowledge this finding 
should not be surprising. They can therefore be used in counseling 
students concerning the choice of colleges, and in selecting students for 
admission. They have some differential value for science and non-science 
majors, just as do achievement tests in appropriate subjects, and just as 
the part scores of scholastic aptitude tests show promise of doing. They 
have, finally, the advantage of not looking like or being labelled as 
intelligence tests, which may make them more acceptable for use with 
some candidates for college entrance. 

Vocational Proficiency Tests 

Unlike tests of educational achievement, vocational achievement tests 
cannot well be used as measures of aptitude in students: vocational 
proficiency tests measure skills or knowledge already acquired, rather than 
ability to acquire them. The degree of skill or knowledge in one 
occupation cannot often be used to predict the degree of skill which will be 
acquired in another occupation, even when the latter can be considered a 
higher level occupation of the same general type: skill as a machinist is 
an inefficient index for the prediction of success in engineering, for 
training in the one occupation takes as long as training in the other and the 
varying degrees of aptitudes needed in each can be more easily measured 
by other types of tests. On the other hand, vocational proficiency tests 
can be used as indices of the prospects of success in a job when dealing 
with trained candidates for a job. Such tests are therefore widely useful 
in selection, but in counseling only with marginal workers who may need 
to be encouraged to change their field of endeavor. Because they are 
largely a selection technique, and because companies with good selection 
devices of their own developing generally prefer to keep them from 
becoming known to others, there is little published material concerning 
specific vocational proficiency tests. Apart from a few stenographic tests, 
the trade tests developed by the U.S. Army, Navy, and Employment 
Service are the most generally known. 

The Blackstone Typewriting Test (World Book Co., qpqp 

Although one of the first tests developed to measure proficiency in 
typing, this test is still a standard test in this field. Developed for 
testing students’ proficiency in courses, it is also useful with employment 
applicants. 

Applicability, Content, Administration, Scoring, and Norms. The 
test can be used at and above the high school level, with persons who 
have had some training in typing. It consists of a typical business letter to 
be copied by the examinee. It can be given in group form, requiring only 
three minutes. The score is the number of errors and corrections. Norms 
are based on more than 2000 cases with from five to 20 months of 
instruction. 

Standardization and Initial Validation. Typical business letters were 
analyzed to determine the average number of strokes per word and 
various forms were tried with varying time limits and scoring methods. The 
final forms distinguished clearly between students with differing amounts 
of training. No validation against success on the job was attempted, since 
it was thought of as an educational achievement test. 

Reliability. The average inter-form reliability was reported as .93 in 
the manual, the students in question having had twenty months of 
training. 

Validity. There seems to be a tendency to assume that tests such as this 
are valid, examination of the content showing that it is a typing test. It 
would still be desirable to know what the relationship is between speed 
and accuracy of transcription in a relatively artificial test situation and 
speed and accuracy in a routine work situation. 

Use of the Blackstone Typing Test in Selection. Each organization 
using a test such as this should empirically determine its own cut-off 
scores by ascertaining the range of scores of its employees and setting the 
critical minimum score at a point which eliminates those who are too 
slow or inaccurate. 
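
One way such a local critical minimum might be worked out is sketched below. This is a hypothetical illustration, not a procedure given in the test manual: it assumes that each present employee has a test score on which higher is better and a supervisor's judgment of whether his work is satisfactory, and it picks the cut-off that misclassifies the fewest employees. The function name and the sample records are invented for the example.

```python
def critical_minimum(records):
    """Choose a local cut-off score from present employees' records.

    records: list of (test_score, is_satisfactory) pairs, where a higher
             score is assumed to mean better performance (an assumption).
    Returns the lowest candidate cut-off that misclassifies the fewest
    employees, or None if no records are supplied.
    """
    best_cutoff, best_errors = None, None
    for cutoff in sorted({score for score, _ in records}):
        # An employee is misclassified when passing the cut-off disagrees
        # with the supervisor's judgment of his work.
        errors = sum(
            1 for score, satisfactory in records
            if (score >= cutoff) != satisfactory
        )
        if best_errors is None or errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff


if __name__ == "__main__":
    # Invented scores and ratings, for illustration only.
    employees = [(30, False), (35, False), (40, True), (42, False),
                 (45, True), (50, True), (55, True)]
    print("Suggested critical minimum:", critical_minimum(employees))
```

In practice the cut-off would also depend on the labor supply and on the relative cost of rejecting a good applicant as against hiring a poor one; the sketch treats both kinds of error as equal.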

The Blackstone Stenography Test (World Book Co., 1923) 

This test is designed to measure more than ability to take and transcribe 
dictation, and to include English, office practice, and related abilities. 

Applicability, Content, Administration, Scoring, and Norms. This 
test also was designed for use at or above the high school level. The 
English test measures knowledge of grammar, punctuation, capitalization, and 
spelling by means of sentences in which the type of error made is to be 
indicated; three tests measure proficiency in hyphenating, alphabetizing, 
and abbreviating; two tests cover knowledge of office practice and business 
organization; and one test measures ability to take dictation at a fixed 
rate and to transcribe two letters on the typewriter. This is a group test, 
but the two letters to be dictated are to be chosen from the manual on the 
basis of appropriateness to the persons being tested. Dictation time ranges 
from one to three minutes, transcription time is twelve minutes, and 
other parts require 33 minutes. Norms are based on 1000 students with 
varying amounts of training. 

Standardization and Initial Validation. Correlations of .62 and .79 
with efficiency ratings for groups of 37 and 49 stenographers are reported 
in the manual. These seem remarkably high, but the data do not permit 
evaluation of the adequacy of this phase of the work with the test. 

Reliability. The inter-form reliability reported in the manual is .88 
for 1000 subjects. 

Validity. No validity data have been located in the literature, although 
they would be even more desirable in the case of a test such as this than 
in the case of the typing test, since it purports to measure more than one 
phase of stenographic work. 

Use of the Blackstone Stenography Test in Selection. In the case of 
this test also, local critical minima should be established with the aid of 
employee evaluation techniques. Such minima must, of course, vary with 
the available supply of workers. 

The Seashore-Bennett Stenographic Proficiency Tests (Psychological 
Corporation, 19J‘>) 

The Seashore-Bennett tests are a new series of phonographically 
recorded stenographic proficiency tests, two forms designed for use in 
employee selection by business firms, the others for use in schools and 
employment agencies. The use of recordings of business letters was 
resorted to as a method of standardizing the voice and rate of dictation. 

Applicability, Content, Administration, Scoring, and Norms. These 
tests have, like others in this category, been designed for use at the high 
school level or above, with persons who have had some training in 
shorthand and typing. They consist of phonographically recorded letters, five 
letters (four discs) to each form of the test. Two letters are short and slow, 
two are of medium length and average speed, and one is long and rapid. 
Administration requires about fifteen minutes, with another half hour 
for transcription. Complete scripts and reproductions of good and poor 
transcriptions are provided for use in scoring. Norms are not provided, as 
it was expected that they would vary considerably from company to 
company, and nation-wide norms could not be collected. Distributions of 
scores for several companies are provided in an article published 
subsequently to the manual (697). 

Standardization and Initial Validation. In one sense, this test can 
depend on internal evidence of validity, for it involves shorthand and 
transcription. It is virtually a life-situation test. Preliminary validation 
studies have been reported, however, showing correlations of . |<j and .hi, 
respectively, with supervisors’ ratings of general value (combined ratings) 
and stenographic ability (697). 

Reliability. When scores on two of the letters were correlated with 
scores on the other three, the reliability coefficients were .80, .83, and .91. 
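
A reliability coefficient of this kind is simply the product-moment correlation between the two part scores. The following minimal sketch shows the computation; the part scores are invented for illustration and are not data from the Seashore-Bennett studies, and the function name is an assumption of this example.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical part scores for five examinees: the score on two of
    # the letters and the score on the remaining three letters.
    two_letter_scores = [12, 15, 9, 20, 17]
    three_letter_scores = [30, 33, 25, 41, 38]
    r = pearson_r(two_letter_scores, three_letter_scores)
    print(f"Part-score reliability estimate: {r:.2f}")
```

Because the two parts here are of unequal length, the coefficient describes the agreement of the parts and is not a stepped-up estimate for the whole test.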

Validity. The tests are too new for other validation studies to have 
been completed and published. 

Use of the Seashore-Bennett Tests in Selection. The availability of 
alternate forms of these tests makes possible their use both in initial 
selection and in the evaluation of progress for promotion. Local norms for 
both purposes should be developed, as job requirements vary both within 
an organization and among organizations. It may be found desirable, for 
example, to start new stenographers in certain departments but not in 
others, transferring them to these latter positions after promotion tests 
demonstrate that they have attained the proficiency needed in the more 
demanding positions. 

The Elwell-Fowlkes Bookkeeping Test (World Book Co., 1928) 

This test was developed for the evaluation of progress in bookkeeping 
courses, and, secondarily, for judging applicants for positions. It covers 
the first two semesters of bookkeeping. 

Applicability, Content, Administration, Scoring, and Norms. It may 
be used with high school students and adults who have had some training 
in bookkeeping. Two tests are available for the two semesters of 
bookkeeping. There are nine parts covering theory, journalizing, classification, 
adjusting entries, closing the ledger, and statements. Administration time 
is about one hour. Norms are based on about 2r,o students in each 
semester. 

Standardization and Initial Validation. The test covers standard 
course material and, like most achievement tests, depends upon face 
validity and care in construction. 

Reliability. Inter-form reliability is .82 and .87 for the two levels of 
the test. 

Validity. No studies reporting field validation have been located by 
the writer. 

Use of the Elwell-Fowlkes Bookkeeping Test in Selection. The nature 
of its content, and its reliability, suggest that this test might be effective 
as a means of checking the mastery of bookkeeping fundamentals in 
inexperienced employment applicants. Local norms are desirable, in view of 
variations in requirements and opportunity for learning on the job. 

Interview Aids and Trade Questions (827) 

During both World War I and World War II extensive use was made 
of trade tests in the rapid classification and assignment of military 
personnel. The first trade tests were described in detail by Chapman (15.1); those 
developed by the United States Employment Service have yet to be 
described in detail. Between the two World Wars the technique of trade 
testing was further developed at the Cincinnati Employment Center (827), 
where the trade questions were revised and brought up to date. 
Subsequently the United States Employment Service recognized the value of 
this approach, and developed trade questions for use in its work as 
described in Stead and Shartle (750: Ch. 3 and pp. 1 r, 0 -i() 2 ). Because of 
its general availability, Thompson’s Cincinnati work (827) is described 
here in order to illustrate the technique; it should be stressed, however, 
that occupational changes which have taken place since the mid-thirties 
make local and up-to-date revisions such as those of the USES necessary 
before his trade questions are put to practical use. 

Applicability, Content, Administration, Scoring, and Norms. These 
trade questions were designed for use in employment offices and 
standardized on experienced craftsmen, but they may also be used with high 
school students who have had some trade training and are seeking their 
first employment. The book consists of questions concerning the tools, 
materials, and methods of 131 trades ranging from Ammonia Pipefitter 
and Armature Winder to Wood Finisher and Woodmill Worker. Each 
test contains from 15 to 25 questions, such as: “What kind of weld has 
a boiler tube?” (for Boilermaker), the correct answer to which is “lap or 
nobble.” The examiner reads the questions aloud, and they are answered 
orally. The examiner notes the answers. This procedure has the advantage 
of appealing to manual workers more than would a paper and pencil test. 
The number of right answers is converted into a decile rating and a 
proficiency rating ranging from novice to expert. Norms are occasionally 
based on small groups, the work having been published while in process 
of completion. 
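
The conversion of a number-right score into a decile rating can be sketched as follows. This is an assumed procedure for illustration only: the book's own conversion tables are not reproduced here, and the norm group, function name, and scores below are hypothetical.

```python
import math
from bisect import bisect_right


def decile_rating(number_right, norm_scores):
    """Convert a number-right score to a decile rating from 1 to 10.

    norm_scores: number-right scores of a norm group of craftsmen in the
                 same trade (hypothetical data in this sketch).
    The rating is the tenth of the norm group in which the score falls,
    counting the proportion of norm scores at or below it.
    """
    ordered = sorted(norm_scores)
    at_or_below = bisect_right(ordered, number_right)
    decile = math.ceil(10 * at_or_below / len(ordered))
    # Scores below every norm score still receive the lowest rating.
    return min(10, max(1, decile))


if __name__ == "__main__":
    # An invented norm group of twenty workers with scores 1 through 20.
    norm_group = list(range(1, 21))
    for score in (0, 1, 10, 11, 20):
        print(score, "right ->", "decile", decile_rating(score, norm_group))
```

With norm groups as small as some of those mentioned above, adjacent deciles may differ by a single right answer, which is one reason the ratings should be read with caution.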

Standardization and Initial Validation. Because of the tendency to 
rely on internal validity in achievement and proficiency tests, and because 
of early publication of the book, statistical evidence of validity is lacking. 
However, the fact that questions were developed with the aid of specialists 
in each field, and their ability to differentiate novices from journeymen 
and experts, constitute evidence of a sort. 

Reliability. Data are not presented on the reliability of these trade 
questions. Stead and Shartle (750) reported reliabilities of .79 to .93 for 
the USES tests. 

Validity. No later studies casting light on the validity of these 
questions in selecting workers have come to this writer’s attention. 

Use of the Interview Aids and Trade Questions in Selection. 
Experience has repeatedly shown that a few well selected questions concerning 
the tools, material, and methods of his claimed trade are likely to weed out 
the ill-trained or inexperienced worker who wants to bluff his way into 
a desirable job (a very real problem in military classification) and 
command the respect of the expert who knows his craft. The use of trade 
questions in employment offices therefore seems amply justified, even 
though controlled experiments and quantitative data are lacking. 

CLERICAL APTITUDE: 

PERCEPTUAL SPEED 

Is There a Special Clerical Aptitude? 

A COMMONLY used classification of clerical jobs (93) describes three 
phases of clerical work: doing the work, checking it, and supervising it. 
Job analysis has shown that these are levels as well as types and that each 
of these levels of clerical work requires the making of more decisions than 
the level immediately below it. But planning and decision-making imply 
intelligence, aptitude for abstract thinking, a requirement by no means 
confined to clerical work. This being the case, one might well ask whether 
there is actually such a thing as clerical aptitude. Material discussed in 
the chapter on intelligence shows that general intelligence is indeed a 
factor in success in clerical work, the minimum desirable I.Q. being qr, 
or 100, and the minimum requirements rising with the level of 
responsibility. When promotability is a factor to be considered in the counseling 
or selection of potential clerical workers, intelligence should be heavily 
weighted; when, on the other hand, success in a routine clerical job is 
in question, intelligence exceeding the minimum requirement is all that 
is needed, other factors then being the decisive ones. What these other 
factors are will be seen below. 

Job analysis suggests other aptitudes which should be important in 
clerical work. In routine clerical work, at least, one would expect speed 
and accuracy in checking numerical and verbal symbols to be a 
characteristic of the successful worker. Bookkeeping, typing, filing, and other 
record-keeping jobs involve constant checking or copying of words and 
numbers, calling for perceptual speed and accuracy on the part of the 
employee. It will be seen below, in the discussion of the Minnesota 
Clerical Test, that this hypothesis is borne out by research; it will also be seen 
that speed in perceiving numerical and verbal similarities is so much more 
important in clerical than in other occupations that there is some 
justification for referring to this ability as clerical aptitude. 

Another aptitude which job analysis suggests should contribute to 
success in clerical activities is motor skill or manual dexterity. Standard 
works on aptitude testing such as that by Bingham (94.152) list motor 
skill as one of the aptitudes required in clerical work, with the obvious 
justification that such work involves frequent and rapid manipulation of 
papers, cards, pencils, typewriters, and other office tools and machines. 
As will be seen in the next chapter, which deals with manual dexterities, 
the only evidence from aptitude tests to support this claim lies in the 
superior scores made by clerical workers on fine manual dexterity tests. 
No studies of the relationship of such tests to clerical success are known, 
and gross manual dexterity has actually been demonstrated to be 
unrelated to clerical success. It seems that other aptitudes such as intelligence 
and perceptual speed are so much more important that anyone who has 
average or better manual dexterity has enough motor skill for success. To 
put it in other terms, the critical score for manual dexterity in clerical 
work is so low that almost everyone of average intelligence surpasses it. 

Finally, analysis of the work of office clerks has suggested that 
proficiency in language and in arithmetic is essential to success. These are of 
course not aptitudes in the strict sense of the term, but only in the sense 
that such proficiency may be prognostic of success on the job. However, 
it has been seen that the validity of clerical proficiency tests has not 
actually been demonstrated against external criteria, legitimate though 
the assumption may appear. 

The answer to our initial question is, then, that two or more aptitudes 
contribute to success in clerical work, and that one of these appears to 
be peculiarly important, partially justifying referring to it as clerical 
aptitude. Although perceptual speed as measured by other techniques is 
important in other occupations (336), it has been shown that there are 
two perceptual factors, one involved primarily in the perception of space 
relations, the other primarily in clerical (numerical and verbal) tasks 
(735: r r,i>). The latter’s importance in clerical work is such as to warrant 
its treatment as clerical aptitude. The balance of this chapter will 
therefore be devoted to a survey of perceptual speed as clerical aptitude. 

Typical Tests 

Tests measuring perceptual speed by means of numerical or verbal 
symbols have long been a standard part of the armamentarium of the 
psychologist, a number of them having been included in the grandfather 
of measurement texts, Whipple’s Manual (919). It was not until the days 
of more refined statistical procedures and validation against occupational 
success, however, that the peculiar value of these tests in vocational 
guidance and personnel selection became obvious. The idea was tried out 
and validated as the Minnesota Vocational Test for Clerical Workers at 
the Minnesota Employment Stabilization Research Institute by Paterson 
and Andrew (see below). The Psychological Corporation’s General 
Clerical Test and the O'Rourke Clerical Test incorporate items of the same 
type, together with others which measure numerical and verbal abilities 
more complex than mere perceptual speed. The Minnesota test is the 
only clerical aptitude test which has been subjected to widespread and 
careful study and validation. It is therefore the only instrument in this 
category to be discussed in detail. 

The Minnesota Clerical Test (Psychological Corporation, icgpj, ipjb) 

This test was the one test construction project carried out by the 
Minnesota Employment Stabilization Research Institute, which found 
that its needs for tests of intelligence, manual dexterity, mechanical 
aptitude, spatial visualization, and personality were fairly well met by the 
then available instruments. It was so easy to administer and score and so 
thoroughly studied that it immediately became one of the most widely 
used aptitude tests. It was originally called the Minnesota Vocational Test 
for Clerical Workers. 

Applicability; Effects of Age, Training, and Experience. The 
Minnesota Clerical Test was designed and standardized for adult use, the adult 
group including girls of 17 and above and boys aged 19 and above. It was 
then assumed that the test would be equally applicable to boys and girls 
of high school age, but data for age and grade norms were subsequently 
compiled (678). These show an increase in scores with age and grade, the 
median Number-Checking scores for 14, 15, 16, 17, and 18 year-old boys 
being 89, cjj, 100, roj, and 102. As Schneidler points out, the sample is 
not perfect for age norms, as it includes only those who happened to be 
in grades 8 through 12: the duller 14 year-olds were therefore not 
included, and the brighter 18 year-olds had already graduated from school. 
However, the age and grade norms resemble each other enough to give 
one some confidence in both sets. 

Unfortunately, Schneidler’s analysis is not sufficiently refined to answer 
the important question concerning the applicability of the test, to wit, 
that concerning the influence of age on scores. Her data reveal an increase 
in the mean scores of increasingly higher age groups, but they do not 
indicate whether this increase is due to the selection which normally 
operates in high schools to eliminate the less intelligent as they get older, 
to the maturation of clerical aptitude with age, or to the effects of 
training and experience in high school which involve practice in speed and 
accuracy of perception. She does provide intelligence test data for her 
sample, but these are in terms of scores which are insufficiently described 
to permit interpretation. If they are intelligence quotients, then there 
was no selection on the basis of intelligence, for the scores remained 
relatively constant throughout the four years of high school. This would 
indicate that the increase in clerical scores with age is due either to 
maturation or to experience. While it would be surprising to find so 
simple a skill maturing as late as the last two years of high school, it would 
be still more surprising, in view of data to be presented below, to find that 
experiences as dissimilar to that of the test as high school work affect 
the test scores. There are clearly some important problems for further 
investigation here before this simple-appearing test is really understood. 
In the meantime it must be used with caution at the adolescent level. 

An attempt was made by Kingman (pjj) to ascertain the effect of a year 
of schooling on the test. His subjects were a group of 207 commercial 
high school girls, who showed significant gains in scores on both parts of 
the Minnesota Clerical Test after a year of high school commercial 
education. As the 90 oldest did not differ significantly from the 30 youngest, 
Kingman concluded that the increase was due to training rather than 
to maturation. To this writer the conclusion does not seem warranted, 
in view of Schneidler’s prior finding that scores increased with age in all 
types of high school students. It is regrettable that Kingman used no 
control groups. 

The problem of the effect of experience, as distinct from maturation,
is not confined to the use of the test with adolescents. Andrew (21)
investigated it in the original studies with the test, administering it to
iV) clerically experienced women aged 17 to 29 and correlating scores
with amount of experience. The correlations for the Numbers and
Names Tests were .30 and .31, respectively. This might be taken as
indicating that clerical experience has some effect on Minnesota Clerical
Test scores, were it not for one problem of sampling: the less experienced
group could normally be expected to include some relatively unselected
workers of low aptitude who are normally weeded out during the first
year or so of experience and who shift to light factory, sales, or other
non-clerical employment. If this group could have been sifted out, in
retrospective analysis, it might have left a group in which “true” clerical
aptitude was equally distributed and in which the correlation between
Minnesota Clerical Test scores and length of experience was zero. In
another study (22) Andrew administered the test to 28 clerically
inexperienced adults before they embarked upon a five-month training
program in clerical work, and readministered it at the end of the
training period. The difference between pre-training and post-training
scores was not significant, leading to the conclusion that training in
clerical work had no effect on the scores of the Minnesota Vocational
Test for Clerical Workers.

A further study of the effects of experience was made by Hay and
Blakemore (360) in a large bank. They tested 229 inexperienced and 241
experienced women applicants for clerical employment. The experienced
group averaged 7 points higher on the Names, and 7.5 points higher on
the Numbers Test, the equivalent of less than .25 sigma or 7 percentile
points at the mean. These differences are statistically significant, but in
practice they are not likely to prove vital, especially if the reasoning
applied to Andrew’s first study is valid and applicable here. Indeed, it is
highly likely that Hay’s inexperienced applicants included some women
of little true clerical aptitude who would in due course be weeded out
and who would not subsequently be in the market as experienced
applicants for clerical employment. If this is so, then it would be all the
more legitimate to consider the small but statistically significant
difference reported by Hay and Blakemore as psychologically and practically
insignificant. The authors found negligible correlations between scores
and length of experience in clerical work, further supporting this
conclusion.
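
The translation of a difference expressed in sigma units into percentile
points at the mean can be sketched as follows. This is only an illustration:
it assumes that scores are normally distributed, which the studies cited do
not state, and the two sigma values used are hypothetical, chosen to bracket
the "less than .25 sigma" mentioned above.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal curve at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_shift_at_mean(diff_in_sigma: float) -> float:
    """Percentile points between the mean and a score diff_in_sigma above it."""
    return 100.0 * (normal_cdf(diff_in_sigma) - 0.5)


if __name__ == "__main__":
    # Hypothetical differences in sigma units, not figures from the study.
    for diff in (0.18, 0.25):
        shift = percentile_shift_at_mean(diff)
        print(f"{diff:.2f} sigma -> {shift:.1f} percentile points above the mean")
```

Under the normality assumption a difference somewhat under .2 sigma
corresponds to about 7 percentile points at the mean, which is the order of
magnitude described in the text.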

In summary, it seems necessary to conclude that it has not been
demonstrated that training or experience affect scores on the Minnesota
Clerical Test. The preponderance of evidence from several ambiguous
studies, together with the clear-cut findings of one study of the effect of
training on scores, indicates that the test is relatively independent of
training and experience in clerical work.

Sex differences have been found to be significant (22,681). This means
that although the test is usable with both sexes, separate norms are
needed. Women tend to be superior to men in general, although in the
same job men and women are found to be equal in clerical aptitude,
indicating the effects of selection. Age, however, has no effect on scores
according to evidence compiled with adult groups by the same authors.

Content. The Minnesota Clerical Test consists of two parts, the
Numbers Test and the Names Test. The former is made up of a series
of pairs of numbers, in some of which the members are identical and in
some different, as in the following samples: 7639-7693, 6291-6291.
The examinee must mark the pairs in which the two members are
identical. The Names Test embodies the same principle, with only minor
differences between the members of non-identical pairs, e.g.: Smith and
Co.-Smyth and Co. The task is obviously simple and routine in nature
although exacting when speed and accuracy are required.

Administration and Scoring. The test is designed for group adminis¬ 
tration and requires fifteen minutes working time. Examiners need to 
make sure that subjects are working on the proper part of the test, and
that they draw a line, as directed, under the last pair at which they looked 
before the direction to stop was given. Scoring is by means of a stencil, 
and involves a correction for wrong answers. Scores are thus a combina¬ 
tion of speed and accuracy, and have been criticized as such by Candee 
and Blum (13 j), who developed a scoring system which yields separate 
scores for speed and accuracy. Their contention is that accuracy is more 
important than speed in clerical work, a slow accurate worker being 
preferable to a fast inaccurate worker. Such a scoring method might be 
desirable when the criteria permit evaluation of the relative importance
of each factor, but in most situations they are not so refined. It seems 
probable that the combined score provided by the test authors is generally 
to be preferred for occupational use, giving as it does some weight to 
each factor. The great majority maintain a fair degree of accuracy, know¬ 
ing that it counts together with speed, and the important individual 
differences revealed in the test are differences in speed (171). If an 
examinee lowers his accuracy level in order to increase his speed, the 
wrong-penalty minimizes the gain. 
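
How a penalty for wrong answers offsets a gain in speed can be illustrated
with a small sketch. The text says only that scoring "involves a correction
for wrong answers"; the rights-minus-wrongs rule used here, the function
name, and the two examinees are assumptions made for illustration.

```python
def corrected_score(right: int, wrong: int, penalty: float = 1.0) -> float:
    """Combined speed-and-accuracy score: items right minus a penalty per error.

    The penalty of one point per wrong answer is an assumed value.
    """
    return right - penalty * wrong


if __name__ == "__main__":
    # Hypothetical examinees: one careful, one who trades accuracy for speed.
    careful = corrected_score(right=98, wrong=2)    # 100 pairs attempted
    hasty = corrected_score(right=100, wrong=10)    # 110 pairs attempted
    print("careful:", careful)
    print("hasty:  ", hasty)
```

In this hypothetical case the hasty examinee attempts ten more pairs but ends
with the lower corrected score, which is the sense in which the penalty
minimizes the gain from lowered accuracy.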

Norms. The manual provides norms for gainfully employed adults,
clerical workers in general, and various specific clerical occupations
such as shipping clerks, routine clerical workers, bank tellers, and
accountants and bookkeepers. The general adult group is the standard
sample used in the Minnesota Employment Stabilization Research
Institute, a cross-section of 500 gainfully employed persons in the Twin
Cities, so selected as to be representative of the urban national
occupational distribution. The norms for the specific clerical occupations are
unfortunately not as satisfactory, consisting as they do of small groups
of relatively undescribed workers in each category. The accountant and
bookkeeper group, for example, includes 29 men; there is no indication
as to how many were accountants and how many bookkeepers, a factor
which has a bearing on their probable intelligence level, nor is evidence
presented to show how representative they are of accountants and
bookkeepers in general. The 181 women stenographers and typists illustrate
the same problem: how many are secretaries, how many are
stenographers, and how many are typists? The norms do not tell one. Despite
these defects, the original norms seem to have some validity, for they
are not excessively out-of-line with those for the equally small groups
studied by the United States Employment Service (Figures 4 and 5).

Fortunately research has not ceased with the compilation of the
original sets of norms just referred to (and it should be stated
parenthetically that at the time of publication, a mere fifteen years ago, these
norms were unusually comprehensive). Norms and critical scores have
been made available for more than 700 men and 1 joo women bank
employees by Hay and Blakemore (359), and adolescent norms have
been published by Schneidler ((>78) and discussed elsewhere in this
chapter. Both are included in the revised manual. The median scores for
clerical workers in the Philadelphia bank studied by Hay and Blakemore
were about ten raw-score points below the mean reported by the
Minnesota studies for routine clerical workers, and equivalent to the 87th
percentile (men) and 70th (women) when compared to the general adult
sample of the Minnesota project. Hay found a critical score of 130
(Numbers) useful in selecting machine bookkeepers: this is about the
median for routine clerical workers according to the Minnesota norms,
and 19 raw-score points below the Number-Checking median for
Minnesota office-machine operators. Since it does not seem likely that
Philadelphians are inferior in perceptual speed to Minnesotans, and since
the former sample was collected over a period of years which included
both depression and prosperity, whereas the latter was taken at the depth
of the depression when inferior clerical employees had been released
(22), it seems likely that Hay’s norms are more representative. This is
confirmed by a USES study cited below. It is noteworthy, however, that
the critical score which Hay established for his concern was almost
identical with the median for the employed routine clerical group in the
Twin Cities and for one of the USES samples. The median and critical
score on the Otis S.A. Test for Hay’s clerical workers being an I. Q. of
100, and the first quartile an I. Q. of 95, it seems legitimate to treat his
sample as about the same as a routine clerical group. The writer is
inclined to wonder about the wisdom of critical scores which, like Hay’s,
are the same as the median of a successfully employed group; a Number
Test score of 112, about one sigma below the mean, would normally be
more practical.

Although not presented primarily as norms, the USES data published
by Stead and Shartle (750) and reproduced in Figures 4 and 5 provided
a valuable source of norms for the Minnesota Clerical Test. In these
figures the means and standard deviations of the raw scores made by
various types of clerical and semiskilled workers on both Numbers and
Names Tests are graphically presented. Although the numbers are small
the data agree reasonably well with those of the Minnesota studies and
with Hay’s; guidance in terms of the limits suggested by the first sigma
points of the USES data, and, lacking local norms, selection on the
same basis, will probably not be far wrong.

The Minnesota Clerical Test has now been sufficiently widely used
for more adequate norms to be available for specific clerical occupations.

Occupation                             r

Bookkeeping-Machine Operator I       -.09
Invoice Typist                       -.08
Calculator Operator V                 .02
Card-Punch-Machine Operator II        .31
Coding Clerk I                        .38
Calculator Operator II                .53
Card-Punch-Machine Operator I         .33
Toll-Bill Clerk                      -.07
Calculator Operator IV                .10
Index Clerk                          -.39
Bookkeeping-Machine Operator II      -.09
Stenographer                          .16
Card-Punch-Machine Operator III       .55
Calculating-Machine Operator I        . 3 a
Put-In-Coil Girl                     -.19
Calculating-Machine Operator II       .59
Calculator Operator III               .58
Calculator Operator I                 .33
Adding-Machine Operator               .51
Pull-Socket Assembler                 .07
Inspector-Wrapper                     .01
Hand Transcriber I                    .19
Cafeteria Counter Girl               -.23
Department-Store Salesperson II      -.12
Hand Transcriber III                  .09
Department-Store Salesperson I       -.17
Lampshade Sewer                       .15
Cafeteria Floor Girl                  .00
Can Packer                            .20
Hand Transcriber II                   .00
Power-Sewing-Machine Operator II      .50
Merchandise Packer                    .35
Power-Sewing-Machine Operator I       .40
Coding Clerk II                       .26


Figure 4

OCCUPATIONAL DIFFERENCES ON THE MINNESOTA CLERICAL NUMBERS TEST
Means and Standard Deviations after Stead and Shartle (750).

Occupation                             r

Stenographer                          .30
Coding Clerk I                        .46
Card-Punch-Machine Operator II        .24
Index Clerk                          -.47
Invoice Typist                        .26
Card-Punch-Machine Operator I         .32
Bookkeeping-Machine Operator I        .19
Toll-Bill Clerk                      -.14
Calculator Operator II                .60
Calculator Operator V                -.05
Calculator Operator I                 .40
Calculator Operator IV                .15
Bookkeeping-Machine Operator II       .09
Calculating-Machine Operator II       .50
Calculator Operator III               .44
Card-Punch-Machine Operator III       .54
Put-In-Coil Girl                     -.21
Calculating-Machine Operator I        .33
Hand Transcriber II                   .46
Inspector-Wrapper                     .30
Pull-Socket Assembler                 .15
Lampshade Sewer                       .14
Department-Store Salesperson II       ..,14
Adding-Machine Operator               .37
Cafeteria Floor Girl                  .05
Cafeteria Counter Girl               -.30
Hand Transcriber I                    .34
Merchandise Packer                    .45
Department-Store Salesperson I       -.15
Hand Transcriber II                   .14
Can Packer                            ,2 q
Power-Sewing-Machine Operator II      * 2 B
Power-Sewing-Machine Operator I       * 2 3
Coding Clerk II

Horizontal axis: Raw Scores (35 to 175)


Figure 5

OCCUPATIONAL DIFFERENCES ON THE MINNESOTA CLERICAL NAMES TEST
Means and Standard Deviations after Stead and Shartle (750).

The latest (1946) manual includes norms from all the above groups
except the USES subjects. Norms for students in a graduate school of
business have been published by Strong (781). With the advances that
have been made in selecting and describing samples ever since the test
first appeared, it is to be hoped that future editions of the manual will
describe even more adequately the groups used in norming the test.

One other problem remains to be discussed in connection with the
norms, stemming from the age differences which have already been
considered. There is a very real question as to which norms to use when
counseling high school students, a problem which Barnette (.pj) has also
encountered with business college students. This may best be illustrated
by a specific example. An 18-year-old high school senior, let us say, is
considering taking training to be an accountant, has taken the
Number-Checking Test, and made a raw score of 106. This puts him at the 50th
percentile for his grade, the 58th for his age, and the 74th when compared
to employed adults. So far, then, the picture is one of average or superior
clerical aptitude, although one might suspect that the superiority of an
average high school senior when compared to employed adults is the
result of the selectivity of high schools. However, when compared to
accountants and bookkeepers, the group with which he is to compete,
his percentile rank drops to the first. The counselor must ask himself
whether this is his true and ultimate standing when compared with
accountants and bookkeepers, in which case he should certainly be
encouraged to consider other possibilities, or whether the poor standing
is the result of immaturity and therefore subject to modification by age
and maturation. If the latter is the case, then he and all of his fellow
seniors will improve in score, making them even more superior to
adults-in-general, although it does not seem likely that they should actually
exceed more than 75 or 80 percent of the employed adult population in
clerical aptitudes. In view of this last consideration, it seems wise to
assume that there will not be much change in the raw scores of high
school seniors after graduation (assumption supported also by the lack
of relationship between age and scores among persons aged 17 to uj
and above previously mentioned). The adult occupational norms should
therefore be used cautiously even for high school juniors and seniors,
rather than the age or grade norms made available by Schneidler.
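
The way in which one raw score takes on different percentile ranks against
different norm groups can be sketched as below. The two norm groups are
invented for illustration and are not the published norms; the function name
and the mid-rank handling of ties are likewise assumptions.

```python
from bisect import bisect_left, bisect_right
from typing import Sequence


def percentile_rank(score: float, norm_scores: Sequence[float]) -> float:
    """Percent of the norm group below the score, counting half of any ties."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


if __name__ == "__main__":
    # Hypothetical raw scores for two norm groups (not data from the manual).
    seniors = [90, 98, 102, 106, 110, 118, 125]
    accountants = [128, 135, 140, 146, 152, 160]
    raw = 106
    print("versus high school seniors:", percentile_rank(raw, seniors))
    print("versus accountants:        ", percentile_rank(raw, accountants))
```

The same raw score is average in one hypothetical group and at the bottom of
the other, which is why the choice of norm group governs the counselor's
interpretation.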

When in due course more light is thrown on the role of maturation it
may be shown to be necessary, and it may become possible, to provide
conversion tables which will show the probable adult score of an
adolescent who has made a given raw score; by converting the adolescent raw
score to the adult equivalent, and this to the specific occupational
percentile, one will then be able to make a fair evaluation of an
adolescent’s prospects of successful competition in a specific clerical occupation.

Standardization and Initial Validation. The several times revised
manual for the Minnesota Clerical Test has been more complete than
most in the presentation of data concerning the standardization and 
initial validation of the test, and has gone somewhat beyond that in 
summarizing subsequent findings—a pattern now fortunately being 
increasingly followed by the more responsible publishers and authors of 
tests. The data which follow concerning the standardization of the test 
are therefore also found in the manual. 

The correlation between Number-Checking and Name-Checking was
found by Andrew (21) to be .66, indicating that the tests have a great
deal in common but that, since their intercorrelation is lower than their
individual retest reliabilities of .76 and .83 (187), at least one of them
is measuring something not so well measured by the other.

This was shown by other correlational data to be intelligence, which
plays a more important part in Name-Checking than it does in
Number-Checking: in homogeneous groups the correlation between the former
and the Pressey Senior Classification and Verification Tests (of
intelligence) was found to be .37, whereas the same figure for the latter was
.12. In heterogeneous groups these correlations rose to .(>5 and .17. These
data bring out another fact important to an understanding of the nature
of clerical aptitude: in a group of persons of the same general level of
intelligence, such as one normally finds in a class in a large high school
and in a business office, clerical perception is an aptitude which is
unrelated to intelligence; on the other hand, in a group of persons with a
wide spread of intelligence, such as one finds in a class in a smaller high
school where sectioning has not been possible or in a group of unsorted
applicants for clerical employment, those who are more intelligent tend
to have more clerical aptitude than those who are less intelligent. The
relationship is far from perfect, but it is real.
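
Why a correlation is higher in a heterogeneous group than in a homogeneous
one can be illustrated with the standard correction for restriction of range.
This formula is not used in the studies cited; it is offered only as a sketch,
it assumes a linear relationship with equal scatter throughout the range, and
the ratio of standard deviations used below is hypothetical.

```python
import math


def correct_for_range_restriction(r_restricted: float, sd_ratio: float) -> float:
    """Estimate the correlation in a wider group from that in a narrower one.

    sd_ratio is the standard deviation of the wide group divided by that of
    the narrow group on the variable on which the group was selected.
    """
    r, k = r_restricted, sd_ratio
    return (r * k) / math.sqrt(1.0 - r * r + r * r * k * k)


if __name__ == "__main__":
    # .37 is the homogeneous-group figure quoted in the text; the ratio of 2
    # is an assumed value chosen only to show the direction of the effect.
    print(round(correct_for_range_restriction(0.37, 2.0), 2))
```

With equal spread in the two groups (a ratio of 1) the function returns the
original coefficient unchanged; any ratio above 1 raises it.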

As the test involves reading words and numbers Andrew correlated it
also with tests of reading speed, spelling, and arithmetic. Using
homogeneous groups, the correlations between reading on the one hand and
Numbers and Names on the other were respectively .oc) and .\y, the
correlation between Names and spelling was .fir,; that between Numbers
and arithmetic was .51. Holding intelligence constant, since it plays a
part in reading and in the Names Test, the correlations between reading
and the Numbers and Names Tests changed to .18 and .30. Since reading
and arithmetic are proficiencies and perception an aptitude, one would
be inclined to assume that clerical aptitude explains the reading and
arithmetic scores, were it not for the fact that skill in reading affects the
speed of perception of symbols such as those used in the Minnesota
Clerical Test. Leaving the riddle of the hen and the egg unsolved, it is
still possible to conclude that, in homogeneous groups, the relationship
between reading skill and clerical aptitude is relatively low. In the case
of arithmetic the riddle is more solvable, for the Numbers Test requires
no computation and is therefore not affected by proficiency in arithmetic:
the relationship reported must therefore be causal from Number-Checking
to computation rather than vice-versa. This may perhaps justify the
conclusion, by analogy, that speed of reading is affected by perceptual
speed as measured by Name-Checking.
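
"Holding intelligence constant" refers to a partial correlation. A minimal
sketch of the first-order formula follows; the three input coefficients are
hypothetical and are not the values reported by Andrew.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with the influence of z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical coefficients: x = clerical test, y = reading, z = intelligence.
    r = partial_correlation(r_xy=0.45, r_xz=0.37, r_yz=0.50)
    print(f"partial r = {r:.2f}")
```

When both variables share a positive relationship with the third, as in this
made-up case, the partial coefficient is lower than the original one.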

Andrew attempted to ascertain the relationship between training and
experience, on the one hand, and Minnesota Clerical scores on the other.
As these studies have been discussed earlier in this chapter they will not 
be dealt with here. 

The relationship between clerical aptitude and success in clerical
training was ascertained (1*2). More than 100 commercial high school
students were rated for prospects of success in training by their teachers,
the ratings correlating .r,K with total Minnesota Clerical scores and .33
with intelligence test scores. The correlations with college accounting
grades were found to be . jy for Numbers and . gj for Names (22). These
results seem extraordinarily good: unfortunately, it will be seen that
subsequent field validation has not tended to confirm them.

The validity of the test for selecting clerical employees was ascertained
(22) by correlating supervisors’ ratings with Minnesota Clerical Test
scores. The groups involved ranged in size from 22 to 97 workers: the
reliability of the ratings was not checked. Even with this presumably
imperfect criterion, the test validities ranged from .28 to . |2. Subsequent
studies of the same type, discussed below, have yielded similar results.

Employed and unemployed clerical workers were compared (22) in
order to ascertain whether or not there were measurable differences in
clerical aptitude between such groups. The critical ratios were 3.32 for
Numbers and \ jq for Names, showing that the employed clerical
workers were significantly superior to the unemployed clerical workers on
these tests. Further analysis showed that the early unemployed were
inferior to those who had been released later in the depression, as well
as to the still employed, but that the late unemployed were not inferior
to the still employed. As it seems logical that the first to be released
would be those whose services were least valued by employers, and the
last those whose services were difficult to dispense with, this would seem
to be a validation of the Minnesota Clerical Test against employers’
ratings of essentiality: an efficiency rating made much more carefully than
the average rating.
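
A critical ratio is the difference between two group means divided by the
standard error of that difference. The sketch below shows the computation
for two independent groups; the means, standard deviations, and group sizes
are hypothetical, since the text reports only the resulting ratios.

```python
import math


def critical_ratio(mean_a: float, sd_a: float, n_a: int,
                   mean_b: float, sd_b: float, n_b: int) -> float:
    """Difference between two independent means over its standard error."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


if __name__ == "__main__":
    # Hypothetical employed and unemployed groups (not the MESRI data).
    cr = critical_ratio(mean_a=120.0, sd_a=30.0, n_a=100,
                        mean_b=108.0, sd_b=30.0, n_b=100)
    print(f"critical ratio = {cr:.2f}")
```

The larger the ratio, the less plausible it is that the observed difference
arose from sampling fluctuation alone.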

A final type of preliminary validation of the test carried out by the
Minnesota Employment Stabilization Research Institute was the
ascertaining of the ability of the Minnesota Clerical Test to differentiate
clerical from non-clerical workers (22). This involves the hypothesis
that the trait measured is truly an aptitude, for the acceptance of which
evidence has been adduced, and the further hypothesis that the aptitude
is not so widely or generally distributed that people-in-general possess
it in a degree equal to that characterizing those in the occupation, a
hypothesis which is automatically checked in this type of validation.



[Panels: Workers in General; Routine Clerks; Accountants-Bookkeepers]

Figure 6

OCCUPATIONAL DIFFERENCES ON THE MINNESOTA CLERICAL (NUMBERS) TEST
Showing the percentage of each type of worker making a given letter
grade. After Andrew and Paterson (inn.

Figure 6 reproduces data from the MESRI studies (irj) which
graphically portray the ability of the Minnesota Clerical Test to differentiate
between workers-in-general and workers employed in various clerical
occupations. The distribution of scores for men-in-general is normal,
whereas the higher one goes in the scale of clerical occupations the more
skewed the distribution becomes. Approximately 7 percent of the
worker-in-general group received letter ratings of E on the Numbers Test, while
no routine clerical workers received a grade of E; in fact, none of the
latter group received ratings of D, and only 3 percent received a grade
of C. Accountants and bookkeepers, on the other hand, in no cases
received a rating as low as C, and more than 80 percent of them rated
A on the Numbers Test, as contrasted with about 37 percent of the
routine clerks and 7 percent of the workers-in-general.

Although the differentiation between clerical and non-clerical workers
shown above is striking, it should not be taken as indicating that no
non-clerical occupational groups excel in what has for convenience been
labelled clerical aptitude. As the Minnesota norms bring out,
miscellaneous minor executives, life insurance salesmen, retail salesmen, and
draftsmen are all above the 80th percentile on the Numbers Test. This
is perhaps only to be expected, in view of the fact that in all of these
occupations there is a great deal of record keeping or work in which
minute details must be accurately and quickly checked. Even policemen
rate at the (dth percentile. But these scores seem less impressive when it
is noted that the only male clerical group whose median is below the
# I j st percentile when compared to the general population is the shipping
and stock clerk category at the 77th percentile.

Reliability. The corrected split-half reliabilities were found to be .85
for the Numbers Test and .Np for the Names Test (manual), while the
retest reliabilities were somewhat lower, .76 and .83 respectively (187).
Hay (358) found retest reliabilities of .Cm, .Op, and .56 for the Numbers
Test, and of .7y, .tie, and .81 for the Names Test after intervals of as
many as 54 months.
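
A "corrected" split-half reliability is ordinarily obtained by stepping up the
correlation between the two half-tests with the Spearman-Brown formula. The
text does not say which correction was applied, so the sketch below rests on
that assumption, and the half-test correlation of .74 is a hypothetical value.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Reliability of a test lengthened by length_factor, given r for one part."""
    return (length_factor * r_half) / (1.0 + (length_factor - 1.0) * r_half)


if __name__ == "__main__":
    # Hypothetical correlation between the two halves of a test.
    print(f"full-length reliability = {spearman_brown(0.74):.2f}")
```

Because the correction estimates the reliability of the full-length test, the
corrected figure is always higher than the half-test correlation from which
it is computed.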

Validity. Because of its rapid and widespread adoption a number of
validation studies have been carried out and published by workers in
the field. These studies have included the usual variety of correlations
with other tests and with educational and vocational criteria.

The relationship between Minnesota Clerical Test scores and
intelligence was checked by Copeland (171) and Super (792). In the former
study, correlations with the Otis S.A. Test were found to be .34 for
Numbers and .31 for Names; in the latter study the A.C.E. Psychological
Examination was used, and the correlations were .i»C> and .(in respectively.
The range of intelligence and clerical aptitude was probably greater in
the latter group, which consisted of high school juniors and seniors,
than in the former, which was made up of unemployed clerical workers.
This would explain the closer relationship between intelligence and
the Names Test in Super's study, but not the slightly lower relationship
with the Numbers Test, which may be due to chance. Both of these
relationships are in-between those reported by Anderson and Paterson
for more strictly homogeneous and heterogeneous groups. Tredick (8(19)
correlated the Numbers and Names Tests with Thurstone’s Primary
Abilities Tests. The former had low r’s with verbal, memory, induction,
and reasoning tests (.06 to .28); the latter with only the memory test
(.24). Other r’s were above .36. Both tests are heavily loaded with
perceptual and numerical factors, while the Numbers Test is weak in the
verbal and reasoning factors.

The relationship between the clerical test and the Co-operative Survey
Test in Mathematics was computed in an unpublished study of high
school juniors and seniors, with negligible relationships resulting (-.07
and -.10). This does not seem to agree with Andrew’s finding of .51 for
Numbers and arithmetic, but may be due to the more advanced
mathematical content of the Co-operative test, which requires reasoning more
than routine computation.

The relative validity of the Minnesota Clerical Test and the General
Clerical Battery of the United States Employment Service was ascertained
by Ghiselli (283), who administered both to a group of r,f>2 workers. His
analysis showed that the latter added nothing to the former, which was
adequate for counseling use.

Teachers' ratings of written work were used as a criterion by Swem
(S09). His subjects were 35 boys and 34 girls enrolled in high school
courses. For the former the correlations with Numbers and Names Tests
were .30 and .{9 respectively; for the latter they were .05 and .gj. Only
the correlations for the Names Test were statistically reliable. These
findings contrast unfavorably with those reported in the original studies.

The relationship between Minnesota Clerical scores and grades in
typing and shorthand was analyzed by Barrett (jC>), working with groups
of 96 and 75 college students. Unfortunately her analysis was not made
in terms of correlations or similar statistics, but inspection of her data
shows a tendency for those who made higher scores on the Minnesota
test to make higher grades in both typing and shorthand. Tredick (8(19)
found correlations of .08, .31, and .27 between grades in Art, Chemistry,
and English Composition on the one hand and Numbers on the other;
the figures for Names were .07, .26, .07. Correlations with average grades
were .36 and .23. The subjects were 113 freshmen women in Home
Economics.

An examination in machine calculation was used as a criterion with
r, i women students of office practice by Gottsdanker (303), whose battery
of tests included a slightly modified version of the Numbers Test. The
correlation between his Number-Comparison Test and the criterion was
.29; when combined with the Tapping Test of the MacQuarrie
Mechanical Ability Test, a Number-Dot Location Test (a “paper keyboard”), and
an Arithmetic Computation Test, the multiple correlation coefficient
was .57.
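
How combining tests can raise the correlation with a criterion above that of
any single test can be shown for the simplest case of two predictors. The
battery described above had four tests, for which the full intercorrelation
matrix would be needed; the sketch below is limited to two, and apart from
the .29 quoted in the text every coefficient in it is hypothetical.

```python
import math


def multiple_r_two_predictors(r_1c: float, r_2c: float, r_12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r_1c and r_2c are the validities of the two tests; r_12 is the
    correlation between the tests themselves.
    """
    r_squared = (r_1c ** 2 + r_2c ** 2 - 2.0 * r_1c * r_2c * r_12) / (1.0 - r_12 ** 2)
    return math.sqrt(r_squared)


if __name__ == "__main__":
    # .29 is the single-test validity quoted in the text; the second validity
    # (.40) and the intercorrelation (.30) are assumed for illustration.
    print(f"R = {multiple_r_two_predictors(0.29, 0.40, 0.30):.2f}")
```

The gain from a second test is greatest when it is valid in its own right and
only slightly correlated with the first.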

Output on the job served as criterion in a study of 39 bookkeeping
machine operators by Hay (358). Speed of posting was used as an index
of output for, as Hay points out, operators are not permitted to remain
at work if they make errors and so learn to work at an accurate speed,
making speed the best index of success. The reliability of the production
trials was checked, and was found to vary around .90 for any one trial
period; when inter-trial reliability was checked, it was somewhat lower,
but in no cases lower than .72. With this carefully studied criterion the
correlations for Numbers and Names Tests were .51 and . j7, respectively.
When these two tests were combined with the Otis S.A. Test a multiple
correlation of was obtained. Hay has used this battery in a large
bank for a number of years with cut-off scores of 1 i ;o for the Minnesota
R“Ms (;pyj).

Supervisors' ratings of the efficiency of clerical workers were used as
criteria in another study (193), in which the validity of Numbers and
Names Tests was found to be .27 and .29. When promotability was
estimated by job level attained after five or more years of service and
correlated with the same tests, the coefficients were .07 and gjj. The
Thurstone Examination in Clerical Work (a proficiency test), and a test
of the same type by O’Rourke had validities ranging from . jo (efficiency)
to .77 (promotability). This is not the reflection on the Minnesota test
that it might seem at first glance, because the former instruments are
tests of mixed functions, comparable to a battery, whereas the Minnesota
is a purer test of two factors only, perceptual speed and, to a lesser
extent, intelligence. It is to be expected that tests of clerical tasks would
correlate more highly with efficiency ratings than a test of perceptual
speed, and that tests as heavily loaded with intelligence factors as the
Thurstone and O’Rourke would correlate more highly with
promotability. When selecting new workers, however, there are important
advantages in using a battery of purer tests, one of intelligence, one of
perceptual speed, and one of arithmetic or language usage, depending
upon the type of clerical work. In Hay’s work (358), for example, the
first two proved sufficient, because in that case neither arithmetic nor
language was of special importance.

In the USES study (750), data from which are reproduced on pages
169 and 170, a battery of tests including the Minnesota Clerical Test
was administered to various groups of clerical workers. For two samples
totaling 234 card-punch machine operators (sex unspecified but presumably
female) the criterion was the average number of cards punched
per hour, with each incorrectly punched card counted as one error,
combined into an “errorless production” score. The reliabilities of the
two components, cards punched per hour and number of errors, were
about .95 for the former and .90 for the latter. Coding clerks (N = 96),
bookkeeping-machine operators (N = 52) and hand-transcribers (N = 62)
were studied with a similar battery and criterion; for calculating-machine
operators (N = 80) and adding-machine operators (N = 26) the criterion
was a worksample. For card-punch machine operators the validities were
.31 and .33 for Numbers, and .24 and .32 for Names; these were among
the most valid tests in the battery, only a letter-digit substitution test
being as good and the MacQuarrie subtests having no consistent validity.
The validities for coding clerks were .38 and .46 for Numbers and
Names, validities equaled by a number-writing test and exceeded by a
personal-data test. In the case of the bookkeeping-machine operators the
validities were -.09 and .19, although, as will be seen later, this group
tended to make high scores on the tests and Hay (758) found validities
of .51 and .47; perhaps the difference lies in the criteria, the USES having
used an error criterion while Hay used a speed criterion which he considered
more valid. That Hay’s criterion was superior is suggested by the
relatively low validity of the other tests in the USES battery, none of
which exceeded .28. For the hand-transcribers the coefficients were .20
and .3], again among the best of the battery, sentence-completion,
vocabulary, and number-writing tests being in the same range. Validities
for calculating-machine operators were .34 and .38 for Numbers and
Names; for adding-machine operators, .51 and .37. For the former group
MacQuarrie Tracing and Location, and a number-finding test were also
valid; for the latter, all of the MacQuarrie subtests except Tapping and
Dotting had some validity, as did vocabulary, number-finding, and an
arithmetic test.

Data for other clerical groups, including some comparable to those
just discussed, and for a number of semiskilled jobs in which it was
assumed clerical perception would be important, are reproduced in
Figures 4 and 5, pp. 169 and 170. Worthy of note are the substantial
negative correlations between Numbers (-.39) and Names (-.47) and
the ratio of errors to the production of index clerks, whose average scores
are more than one sigma above the women’s mean: this suggests that
in this occupation a high level of perceptual ability is desirable, but
that those who are too much above the critical minimum are likely to
be the poorer workers; whether or not this is because their rate of work
is too fast for the precision requirements of the job is not shown by the
data. Also noteworthy, in view of the Blum and Candee study cited
below, are the correlations of .015 and .30 between Numbers and Names
on the one hand and ratings of inspector-wrappers on the other, and those
of .35 and .45 between the same tests and production records (ratio of
time required to standard time per unit) of merchandise packers. Another
non-clerical job for which the test had some validity was
power-sewing-machine operator (.40 to .50, and .23 to .28).

Blum and Candee (rod) tested 317 seasonal and 55 permanent packers
and wrappers in a department store. In the permanent group the Numbers
Test had a correlation of .57 with packers’ production, and the
Names Test one of .hr with wrappers’ production. In the seasonal group
only manual dexterity was important. The authors’ conclusion that the
initial job adjustment of packers is somewhat affected by speed of gross
arm-and-hand movement, while long-term superiority is more dependent
upon clerical speed and accuracy, seems legitimate. But the differential
results for Numbers and Names, packers and wrappers, need further
investigation before the matter is closed: the USES study showed that
both were valid for packers.

In the study of pharmaceutical inspector-packers Ghiselli (286) worked
with 26 young women who were rated by their forewoman and supervisor.
The correlation between the two sets of overall ratings was .72,
which was considered adequate reliability and justification for combining
the two to serve as criterion. The correlations with Minnesota Numbers
and Names Tests were respectively .29 and .26.
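
The procedure described here, checking the agreement of two raters before pooling their ratings into a single criterion, may be illustrated by a brief computational sketch. The ratings below are invented for illustration and are not Ghiselli's data. The function name `pearson_r` and the use of a simple average as the pooled criterion are assumptions, since the text does not state how the two sets of ratings were combined.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical overall ratings of the same eight workers by two raters.
forewoman = [3, 4, 2, 5, 4, 3, 1, 4]
supervisor = [3, 5, 2, 4, 4, 2, 2, 3]

# Agreement between raters, taken as the reliability of the ratings.
agreement = pearson_r(forewoman, supervisor)

# Pooled criterion (assumed here to be the average of the two ratings).
criterion = [(a + b) / 2 for a, b in zip(forewoman, supervisor)]

print(round(agreement, 2))
```

Test scores would then be correlated with `criterion` in the same way to obtain a validity coefficient.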

Apparently packing work of both gross (department store) and fine
(pharmaceutical) types requires speed and accuracy of perception such
as is measured by the Minnesota Clerical Test. Just why the gross type
should require it more consistently than the fine is difficult to see. It
will probably not become clear until other studies of these and other
packing jobs are made with the same tests, in combination with detailed
job analyses. It would be illuminating, for example, to know whether
it is speed in recognizing numbers and names, as in the Minnesota test
and in clerical activities, which is important in packing and wrapping,
or whether it is general perceptual speed and accuracy such as might be
measured by other speed of discrimination tests. If the former, the Minnesota
test is perhaps truly a test of clerical aptitude; if the latter, it is
more probably a perceptual test measuring something of value in a
variety of occupations. The data on power-sewing-machine operators
suggest that it is the latter.

Two new studies have checked the ability of the Minnesota Clerical
Test to differentiate between persons in clerical and non-clerical occupations.
In one investigation Barnette (.j j) found that business college
students were superior to general adults, but inferior to clerical workers,
on both Numbers and Names Tests. One would expect this of a student
group, some of whom were likely to be weeded out before establishment
in the occupation, unless they were preparing exclusively for the higher
levels of clerical work.

The other study is from the United States Employment Service studies
in occupational analysis, previously discussed, and cited by Stead and
Shartle (7yo: Ch. S and pp. im 7-225). As the sex composition of the
occupational groups is not specified, comparing them with general
population norms involves the assumption that most of the clerical
workers studied were women; this is probably a legitimate assumption.
When comparing one clerical group with another the procedure is made
more justifiable by the fact that men and women in a given clerical job
are found to be equal in clerical aptitude. As Figures 4 and 5 (pp. 169
and 170) show, there is a definite tendency for clerical workers to make
higher scores on the Numbers Test than workers in the semiskilled
occupations to whom the same test was administered. The mean scores
of almost all of the clerical jobs tested were above the mean of the
MESRI standard sample of employed adults, hand transcribers and one
sample of coding clerks being the only clerical workers whose average is
lower than the adult average. A cut-off score of 122 (about one sigma
above the adult women’s mean) would include all of the clerical workers
above the mean of their group except those just mentioned and ten-key
adding-machine operators; of the 12 non-clerical jobs included in the
list, only the put-in-coil girls have a mean score as high as this. If Hay’s
critical score for bookkeeping-machine operators (also his mean) of 130
were used, all of the above-average bookkeeping-machine operators and
other comparable clerical workers would surpass the critical score.

The data for the Names Test show similar trends, although Hay’s
cut-off score of 130 appears to be too high for this test: 105 or 110 would
be comparable to that used for the Numbers Test, although the latter
is about the mean for adult women. The differentiating power of the
Minnesota Clerical Test revealed by these data is greater than it at first
seems, because the non-clerical jobs in the occupational sample were
included on the assumption that perceptual speed as measured by the
Minnesota test would be important in them too, a hypothesis proved
valid for some by the reported validity coefficients. It is noteworthy,
however, that these non-clerical jobs in which clerical perceptual speed
is important almost invariably rank lower in the amount typical of their
workers than do the clerical jobs themselves.

One of the objectives of vocational counseling and selection is the
attainment of satisfaction in his work by the worker. This being the case,
one would expect to find studies of the relationship between clerical
aptitude and job satisfaction. No such studies have been located, however,
the emphasis having so far been entirely on success.

Use of the Minnesota Clerical Test in Counseling and Selection. The
preceding discussion has brought out the fact that the Minnesota
Clerical Test has value for distinguishing those who have promise for
clerical work from those who do not, and that the higher the score made
by a person the higher, other things being equal, he may rise in the field
of clerical work. Even though persons in the highest level clerical jobs
are characterized by more perceptual ability than those in lower-level
clerical work, one is not justified in assuming that this is all that need
characterize the aspirant to high-level clerical work. We have seen also
that while perceptual speed is more important in routine clerical work
than is intelligence, intelligence is probably more important in promotion
to the higher levels than is perceptual speed.

When appraising clerical promise it is well, therefore, to use tests of
both perceptual speed and intelligence. If a battery can be used, it should
include the Minnesota Test (Numbers and Names) and an intelligence
test such as the Otis. If time is at a premium, the Minnesota Numbers
Test and the Otis will do. If only one test can be used, and it must be
brief, then the Minnesota Names Test, as a combination of perceptual
speed and intelligence, may suffice. In selection programs, if the selection
is to be made from a wide range of ability an intelligence test may suffice
as a screening instrument, because of the correlation between the two
aptitudes in heterogeneous groups. But if the selection is to be made
from groups with a limited spread of general intelligence, the Minnesota
Numbers Test is preferable as a purer measure of the important variable.
Although the differences between experienced and inexperienced workers
on the Minnesota Clerical Test were slight, and probably due to selective
factors rather than to experience, it is worth noting (until more
conclusive evidence is available), that at least one personnel worker
(Hay) has thought it advisable to use separate norms in selecting experienced
and inexperienced clerical workers.

In counseling the principal problem which is raised by the research
is that of age and occupational norms. Although increase in scores with
high school grade and age has been demonstrated, it is not clear to what
extent this is due to maturation and to what extent to the elimination of
the less able students as they reach the higher grades. In view of the fact
that there is no change in scores with age from ages 17 to 29, and since
the age changes in mid-adolescence are open to some question, it seems
wise to use adult norms even with high school juniors and seniors until
more adequate evidence is available on the effects of maturation. When
the test is being used at the junior high school level for curricular guidance
purposes grade norms are to be preferred, as maturation may play
a significant part at that age and school work can provide an exploratory
experience which supplements the test scores. Obviously, students who
take commercial courses in high school should have appropriate mental
ability and more than average clerical aptitude. Since directional guidance
is all that is needed at that stage, the more specific decisions can be
postponed until a later age when tests and experience yield more specific
evidence.

In using the adult norms, emphasis should be on the occupational
rather than on the general norms. It only confuses the issue to know that
a man is at the 74th percentile in accounting (Number) ability compared
to men-in-general, when in reality he exceeds only 1 percent of accountants
in that type of ability, for it is against accountants rather than
men-in-general that he must match his accounting aptitude. However, these
occupational norms must still be used with considerable caution, since
they are based on small and relatively nondescript groups whose representativeness
is unknown except for the rough correspondence of MESRI,
USES, and Hay’s norms. Most guidance centers should be able to develop
norms of their own which are more adequate for local use than the
original Minnesota occupational norms, but these should be occupational
norms, not norms for all clients locally tested.
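
The contrast between general and occupational norms can be made concrete with a small sketch. The two norm groups and the client's score below are hypothetical figures chosen only to show the effect; they are not taken from the Minnesota, USES, or Hay norms. The function name `percentile_rank` is likewise an assumption, and it uses the simplest definition (percentage of the norm group scoring below the client).

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring strictly below the given score."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


# Hypothetical Numbers Test scores for two norm groups (illustrative only).
men_in_general = [70, 80, 85, 90, 95, 100, 105, 110, 120, 135]
accountants = [118, 125, 130, 135, 140, 145, 150, 155, 160, 170]

client_score = 120

general_pr = percentile_rank(client_score, men_in_general)
occupational_pr = percentile_rank(client_score, accountants)

print(general_pr, occupational_pr)
```

The same raw score looks strong against the general group and weak against the occupational group, which is the comparison that matters for a man who must compete with accountants.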

In administering the Minnesota test for non-clerical purposes as in the
counseling or selection of semiskilled workers, it is well to supplement
the directions with a statement that the test is a measure of the speed
with which details are noticed, and that this is an important characteristic
in a number of assembly, inspection, and other jobs. This helps to
counteract the antipathy of some examinees to anything with a clerical
label.

Finally, a word concerning speed and accuracy. We have seen that as
a rule speed on this type of test is a good measure of accuracy. But there
are occasional exceptions, and one subject will make a given score by
working rapidly with errors, whereas another will make the same score
by working more slowly without errors. For this reason the psychometrist
or counselor should examine the responses to each test, and take the
error score into account in making his interpretation. While it may not
help as much in judging prospects of success as the total corrected score,
it will help considerably in understanding the person being evaluated
or counseled.

MANUAL DEXTERITIES

Nature and Role 

Singular or Plural? Personnel men, vocational counselors, and psychologists
have long been in the habit of referring to manual dexterity
as though it were a unitary ability. If this were so, then it would be
legitimate to conclude that a person who is adept at one manual activity
has the aptitude to become equally adept at any other manual activity.
It would also be true that one good test of manual dexterity would be
sufficient in a battery used to survey the assets of a student or employment
applicant.

The plural form has been used in the title of this chapter in order to
stress the fact that the research of the past decade (735) has demonstrated
the existence of at least two types of manual dexterity: gross and fine.
Another way of describing them might be as manual dexterity and
finger dexterity; or confusion might be avoided if the terms arm-and-hand
and wrist-and-finger were substituted for these. Further study may in
due course reveal that even this breakdown is inadequate, and that
dexterity is in reality a continuum, gross at one extreme and fine at the
other, at least in a logical sense. The use of different anatomical parts
in gross and fine manual activities may, however, justify treating
arm-and-hand and wrist-and-finger dexterities as discrete aptitudes. It will
be seen below that, at least as measured by the tests now available, these
two types of dexterity are relatively distinct and unrelated to each other.
Furthermore, a factor analysis study of 59 different aptitude tests conducted
by the United States Employment Service (735) revealed two
dexterity factors, one of which was common to the Placing and Turning
Tests of the Minnesota Rate of Manipulation Test and to the Peg Board
Apparatus of the USES (both of which require relatively gross movements),
and the other important in tests requiring fine assembly work.
Most relevant of all are studies by Seashore (698) and Buxton (127), using
laboratory tests, in which factors which appear to consist of manipulative,
wrist-turning, arm-and-shoulder, ballistic (uncontrolled), steadiness,
and one unidentifiable motor skills were isolated. The first three appear
to be distinct, anatomically based, dexterities. No tests have yet been devised
which suggest intermediate degrees of fineness, although some have
been investigated which require varying combinations of arm-and-hand
and of finger dexterities.

What is Manual Work? Another set of distinctions which needs to
be made early in the discussion of manual dexterities is those between
manual work and mechanical work, manual dexterities and mechanical
aptitude. White-collar workers and professional people who have not
had intimate contact with industry often confuse manual and mechanical
work and skills, taking note only of the fact that both involve use of the
hands. Aware that some factory and shop work is skilled, some semiskilled,
and some unskilled they assume that these distinctions in the
degree of skill characterizing the work are distinctions in degree of
manual skill. Hence the unwarranted conclusion that the higher the
level of skill in industrial employment, the greater the need for manual
dexterity.

As experienced industrial men and personnel psychologists have long
known, nothing could be further from the facts. The independence of
measures of manual dexterity and of mechanical comprehension or
spatial visualization will be brought out in subsequent parts of this
chapter and in the two which follow, as will the different degrees to
which manual (unskilled and semiskilled) workers on the one hand and
mechanical (skilled) workers on the other hand tend to possess these
aptitudes. It should suffice to point out here that “manual” work is
essentially semiskilled or unskilled; semiskilled work relies primarily
on the manual skill of the worker in assembling objects, packing them,
or in other ways manipulating them with fingers or arms and hands, and
unskilled work depends primarily upon the strength of back and legs
and body co-ordination rather than eye-hand co-ordination; skilled
work, on the other hand, is more dependent on the understanding and
planning of the worker than upon mere manual dexterity. To put it in
everyday industrial parlance, the skilled worker needs “know-how,” the
semiskilled worker skillful hands and fingers, and the unskilled worker
a strong back.

A unique contribution to the understanding of manual skill and the
nature of semiskilled work was made by Cohen and Strauss (162) in a
study of 21 experienced women employed in a highly repetitive operation.
The task consisted of folding an 18 X 18-inch gauze sheet to a size
approximately 4 X 4 inches, and required six foldings. Motion pictures
were taken of the operatives at work, and operation analysis was made.
It was found that, in general, the more skilled operatives (so classified
by a standard time-and-motion study technique) performed their work
more simply. This greater simplicity of technique was illustrated by
several differences in methods. Better operatives have fewer limiting
grasps and releases, in that they grasp and release as a part of transport
operations rather than as separate movements; their movements are more
global, less discrete, than those of inferior operatives. The more proficient
operatives make the movements of their two hands overlap more
than the less proficient workers, thus performing two operations at once
instead of one after the other. The Purdue Pegboard, described later
in this chapter, is almost unique in testing this type of two-hand
co-ordination. Poorer operatives make more extra moves because of fumbles,
faultily performed operations, and superfluous operations than do the
better operatives; the latter therefore have a shorter work cycle than the
former, and a higher rate of production.

Superior skill manifested itself not only as greater speed of performing
basic operations but, as the above makes clear, as improvement in the series
of basic operations performed. The authors therefore asked, “Is method
independent of skill?” Their answer is an affirmative for general method,
but a tentative negative for the basic operations. An illustration helps
to make the point: “One operator releases a part during a motion
rather than after it has been made, but if the less skilled operator
attempts to do so, the part may not be placed correctly and an adjustment
may be necessary. Therefore the first operator can perform without
the occurrence of ‘Release’ as a limiting operation, but the second cannot”
(162:152). It is the accumulation of such small differences which
differentiates operatives. Cohen and Strauss feel (without evidence) that
the problem is primarily one of selection rather than of training, and
suggest that dexterity tests are needed which can measure the ability to
eliminate limiting motions or to merge them into more global movements.
Although no available dexterity tests yield scores of this type, Test
IV of the Purdue Pegboard (see below) provides excellent opportunity to
observe this type of skill, and other dexterity tests give some clues.

Typical Tests 

The best known test of arm-and-hand dexterity is the Minnesota Manual
Dexterity Test, better known as the Minnesota or Ziegler’s Rate of
Manipulation Test. No other test of this type has been widely studied
or used. Wrist-and-finger dexterity tests include O’Connor’s Finger and
Tweezer Dexterity Tests and the Purdue Pegboard, the latter also a
measure of arm-and-hand dexterity. Both are dealt with in this chapter.
Other tests of this type are the Pennsylvania Bimanual Worksample
(Educational Test Bureau) and the pegboard and plier-dexterity tests of the
United States Employment Service. These and others like them are not
treated here, because they are newer and less well validated or not generally
available.

The Minnesota Rate of Manipulation Test (Educational Test Bureau, 1931)

The Minnesota Rate of Manipulation Test was originally developed
by Ziegler as the Manual Dexterity Test, in connection with a study of
the role of manual dexterity in performance on the Minnesota Spatial
Relations Test. For this reason it was not, unfortunately, included in the
Minnesota Mechanical Abilities Project (588), although several other
tests designed to measure dexterity were used; it was, however, available
in time for inclusion in the research of the Employment Stabilization
Research Institute (589). It has been published in two editions, one by the
Mechanical Engineering Department of the University of Minnesota, the
other by the Educational Test Bureau. The latter version differs from
the former in the arrangement of parts at the beginning and the end of
the test, in the number of parts (60 vs. 58), and in the colors used on the
movable parts. As the university version was used in the extensive normative
work of the MESRI, only it should be used with the employed adult
and special occupational-group norms gathered by the Institute. This fact
appears to have been disregarded by the publishers of the other version,
who give norms for 500 unidentified adults which seem to be those of
the Minnesota project. The Educational Test Bureau version is more
widely used despite this fact, probably because of a more finished manufacturing
job which includes a tray to hold the formboard and parts,
combined with more aggressive marketing methods. Supplementary
norms for this form of the test are available in the literature, as will be
seen below, but the manual has not been revised in the necessary detail.

Applicability. The Minnesota Rate of Manipulation Test was designed
for use with and standardized on adults. It has generally been assumed
that it is applicable at any age level between 13 and 50 (91), dexterity
being a characteristic which matures relatively early. However, the old
Educational Test Bureau norms show that men and women are faster
than boys and girls, and Tuckman (877) found even greater differences
between adults and adolescents. According to his data, for example, a
raw score of 232.5 is equal to the 50th percentile for boys, but the 27th
percentile for men. The question is raised as to whether these differences
are due to the selection of the samples (clients of a guidance center may
come for different reasons, from different backgrounds, at different ages),
to differences in the motivation of the two age-groups (the boys may
consider manual tasks beneath them while the adults are more realistic
in their vocational objectives), or to the role of maturation (manual
dexterity may still be developing in the boys). The study was not so
planned as to throw light on these various alternatives. Seashore (696a)
has shown that college men do substantially better than the norms. Perhaps
in the future more persons planning test research will recognize
the futility of merely compiling normative data for relatively undescribed
groups, and so set up their research as to provide for answers to questions
such as these. As in the case of the Minnesota Clerical Test, the applicability
of this test to adolescents is still in doubt.

Content. The test consists of a formboard in which are four rows of
identical holes, with fifteen holes in each row. Sixty identical discs, each
somewhat larger than a checker, fit into these holes, the thickness of the
discs being greater than that of the formboard so that they may be readily
grasped while in place. The flat sides of the discs are differently painted,
so that they contrast with the board and so that a ready check may be
made in the Turning Test (Educational Test Bureau form only). This
test consists of administering the test with the discs in place, but to be
turned over and returned to their places by the examinee; the Placing
Test (both forms) involves moving the discs from the table-top to the
holes in the formboard.

Administration and Scoring. The test is administered individually,
with the subject standing at a table of normal height. The examiner
places the board with discs on his own side of the table, leaving a little
more room between the board and the examinee’s edge than is required
to accommodate the board. The formboard is then raised, leaving the
discs on the table and undisturbed. The formboard is then placed between
the discs and the examinee, about one inch from the edge of the
table. All this is as recommended in the manuals; administration is further
simplified if the psychometrist uses a light board or tray open on
one side as a base for the formboard, sliding the latter off the base or tray
to place the discs and putting the base back under the formboard when
placing it in front of the subject for testing. This makes it possible to lift
and remove the formboard without losing discs, and has them in place
for the next administration. The test is administered in four trials,
requiring from six to eight minutes all told.

The scoring used in the original MESRI studies added all four trials;
Darley (187) has shown, however, that greater reliability is achieved by
using the first trial as practice and adding the time required in the last
three as the score. The revised Minnesota manual gives appropriate
norms.
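
The two scoring rules just described differ only in whether the first trial is counted. A minimal sketch follows; the trial times are hypothetical and the function names are illustrative, not taken from either manual.

```python
def score_all_four_trials(trial_times):
    """Original MESRI scoring: total time over all four trials."""
    if len(trial_times) != 4:
        raise ValueError("the test is administered in four trials")
    return sum(trial_times)


def score_last_three_trials(trial_times):
    """Darley's scoring: first trial is practice, last three are summed."""
    if len(trial_times) != 4:
        raise ValueError("the test is administered in four trials")
    return sum(trial_times[1:])


# Hypothetical times in seconds for one examinee's four trials.
times = [68.0, 62.5, 60.0, 59.5]

print(score_all_four_trials(times))    # total of all four trials
print(score_last_three_trials(times))  # total of trials two to four
```

Since the two totals are on different scales, a score must be read against norms prepared for the same scoring rule.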

Variations have been tried also by Jurgensen (413) and Wilson (932).
In the former study Jurgensen used nine methods of administration, some
involving use of one hand, some the other, and some both. (When both
hands are used, blocks are picked up from the same row, in adjacent
columns, except in the last, odd, column.) Although he concluded that
his revision is more valid and more reliable, and that the part scores
are more independent than in the standard version, this method has not
been widely taken up. It nevertheless merits consideration, along with
other variations, when the test is to be validated as part of an
employee-selection program, for some variations will almost certainly be more valid
for some jobs and less so for others, because of the operation of specific
factors. Wilson’s modification consisted of using only the lowest of three
trials rather than the total time, but he gives opinion rather than evidence,
convenience rather than validity, as justification for the procedure.

Norms. As was previously indicated, norms for the University of Minnesota
form are available, for the Placing Test only, for the MESRI
standard sample of 500 adult workers, and for about a dozen occupations
such as butter-wrappers, food-packers, bank tellers, typists, and garage
mechanics, represented by from 1 } to i(ij persons each. Although these
small occupational groups were sufficiently large to supply answers to some
of the questions studied at the Minnesota Institute, and are more varied
than those upon which most aptitude tests antedating World War II were
based, they are not satisfactory for vocational guidance or selection. Data
based on them throw a great deal of light on the nature of the traits
tested, but it is altogether possible that norms based on larger and more
representative groups would differ considerably from these. The best
test-construction projects of the war and post-war era have recognized the
need for larger as well as more varied norm groups: as these projects are
completed and become better known the better norming of tests such as
this will become a necessity, from a marketing as well as from a professional
point of view. It might be added, parenthetically, that this relatively
new recognition of the need for large-scale occupational norms is
virtually taking test construction out of the hands of individual psychologists,
who will still originate test ideas, and is putting it in the hands of
consulting organizations and test publishers who have the financial resources
to subsidize the extensive standardization research which must
precede publication. It will also take test publication out of the hands of
publishers who merely print and sell tests without carrying on or subsidizing
test standardization.

The Educational Test Bureau form supplies norms for both Placing
and Turning Tests, and for three additional variations developed by
Jurgensen (413); as pointed out above, MESRI norms which are included
in the Bureau norms in some unspecified manner should not be used
with this different form until evidence is produced to show that the
difference in the formboards does not affect performance. Subsequent
studies by Teegarden (815, 816), Tuckman (877), Jurgensen, Seashore
(696a) and Cook and Bane (170) have used this form, and make available
other sets of norms. Teegarden’s are perhaps the most useful, for she
sampled a dozen jobs represented by applicants at the Cincinnati Employment
Center, a white group ranging in age from 16 to 25. As they were a
young group, their experience was somewhat limited and their occupations
in many cases as yet unsettled. In her first two papers (815) Teegarden
gives norms for this group of 500 young men and 360 women taken
as a group; in the last paper (816) she gives data on occupational differences.
The fields represented include such entry jobs as helpers in
skilled trades, operatives of factory machines, factory operatives (hand),
packers and wrappers, restaurant workers, and assemblers, inspectors, and
testers, together with more adult occupations such as manual laborers,
truck drivers and chauffeurs, and sales clerks. The numbers in these occupational
groups, as in the MESRI samples, were small, ranging from
26 (truck loaders and helpers) to 123 (women domestics). Like the MESRI
norms, they give one an understanding of the test and of the significance
of arm-and-hand dexterity in various types of work (a topic dealt with
below), but they are neither large enough nor well-enough selected to
serve as norms in the usual sense of the word.

Tuckman’s norms are for 1117 subjects aged 18 to 58, tested at the
Jewish Vocational Service in Cleveland. This group was interested in all
types of work, and had varying amounts of education and mental ability,
but as clients of a guidance and placement office they were not
representative of adults-in-general: 365 were high school students, 407
adult men, and 345 women, and the mean age was 22. The Cleveland boys’ and
girls’ Placing norms were approximately the same as those provided by the
Educational Test Bureau, but the adults were faster than the original
norm group; on the Turning Test, all of the Cleveland groups were
faster. Men excelled most in Placing.

Jurgensen tested 212 male paper-mill operatives aged 18 to 31. These
norms were combined with MESRI and other data in a way not indicated
by the 1946 manual. Seashore’s data are for two groups of 90 and 48
college men. They did much better than the norm group.

Cook and Barre tested .j(JS men and 2007 women applicants for
manufacturing employment, providing new norms for 18 to 2", year-olds. This
group differs from Teegarden’s and Tuckman’s in that it was a factory
population, at least temporarily: Teegarden’s subjects were willing to
accept “anything,” but some were clerical, sales, and service workers by
background; and Tuckman’s included many from the professional and
managerial levels, the median Otis percentile being 71 for men and 52
for women. As might be expected under the circumstances, Cook and
Barre’s norms differ from the old, being higher. Like Tuckman, they
found that sex norms were needed for the Placing, but not for the Turning
Test. The writer is inclined to believe that Teegarden’s norms are
the most helpful to the user of the Educational Test Bureau form in
counseling, since, like the MESRI norms for the university form, they
make possible some differential interpretation; but they should not, for
reasons given above, be used mechanically. In selection, local norms
should be developed, using the available occupational norms only as
a source of ideas as to situations in which the test may prove useful.

Standardization and Initial Validation. The original work with the
Minnesota Manual Dexterity Test, apart from the study of its role in
spatial visualization tests, having been carried out as part of the
operations of the Employment Stabilization Research Institute, the
standardization and initial validation data have to do only with the
reliability and occupational differentiation of the test. Its reliability
is taken up below. Its ability to differentiate workers in various
occupations was demonstrated by the occupational norms discussed above.
Highest scores were made by women butter packers and wrappers and by women
food packers, who stood at the 94th, 92nd, and 88th percentiles
respectively, while semiskilled workers in general stood at the 64th.
Apparently arm-and-hand dexterity is important especially in packing and
wrapping jobs.

Bank tellers were at the 85th percentile, ranking highest among clerical
workers, with men office clerks at the 77th, which suggests that, although
there is no correlation between manual dexterity and success in office
work, office workers are a somewhat select group in dexterity. Since the
104 women office clerks were only at the 60th percentile when compared
to general adults, and since the male office clerks were only (i(> in
number, this may be partly a result of sampling. It is probably wise to
suspend judgment concerning the importance of manual dexterity in clerical
work, operating on the conclusion that the critical minimum is rather
low, a conclusion which is in accord at least with the data concerning
women clerks.

Finally, it should be noted that the skilled groups tested in the
Minnesota project did not differ greatly from the mean of the general adult
population. Skilled workers in general averaged at the (kith percentile,
while garage mechanics, to cite a specific example, were at the h when
compared to employed adults. This bears out the statement made at the
beginning of this chapter, to the effect that skilled workers depend not
on their manual skill, but on other aptitudes and upon technical
knowledge.

Reliability. Darley reported that the reliability of the Placing Test
was above .90 for the standard sample (187). Tuckman also used the
odd-even method (878), reporting corrected reliabilities of more than .90
for his samples. He obtained retest reliability coefficients which were
slightly lower, probably because of practice effect. In this connection, he
confirmed Darley’s finding that it is best to use the first trial for
practice; the mean score for the first trial was at the 52nd percentile,
while that for the 4th trial was at the 79th, a substantial improvement.
Jurgensen (413) found reliabilities of .87 and .91 for 212 adult men.
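
A "corrected" odd-even reliability is ordinarily obtained by stepping up the correlation between the two half-tests to the length of the whole test. The studies cited do not say which correction was applied, so the following Python sketch simply assumes the usual Spearman-Brown step-up; the function name and the half-test correlation of .83 are hypothetical, chosen only for illustration.

```python
def spearman_brown(r_half: float) -> float:
    """Estimate full-length reliability from an odd-even (half-test) correlation.

    Assumes the standard Spearman-Brown formula for a test of doubled length:
    r_full = 2 * r_half / (1 + r_half).
    """
    if not -1.0 < r_half <= 1.0:
        raise ValueError("r_half must lie in the interval (-1, 1]")
    return 2.0 * r_half / (1.0 + r_half)


# Hypothetical odd-even correlation, not a figure from the studies cited.
r_odd_even = 0.83
print(round(spearman_brown(r_odd_even), 3))
```

Under this assumption an uncorrected odd-even correlation in the low .80's is enough to yield a corrected coefficient above .90, which is why corrected and retest coefficients should not be compared without noting the method used.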

Validity. The first step to be taken in ascertaining the validity of the
Minnesota Rate of Manipulation Test would seem to be to determine how
independent the two parts are. This was done by Blum (106), Jacobsen
(396), Jurgensen (413), Seashore (696a), Teegarden (815) and Tuckman
(878). The first obtained a correlation of .55 based on 120 women packers
and wrappers, which compares very favorably with that of .57 reported in
the test manual. Jacobsen’s intercorrelation was only .27, for 90 aircraft
industry trainees. Jurgensen’s intercorrelation was .52 for 212 adult male
paper-mill operatives. Seashore reported correlations of .46 and .58 for two
samples of college men. Tuckman’s figures were higher, being .60 for 345
women and .66 for 407 men. Teegarden’s were still higher, at .65 for
171 women and .73 for 230 men. Presumably the true correlation is about
.60 for women, and somewhat higher for men (only Jacobsen’s study is out
of line), indicating that the two tests are measuring the same basic
aptitude manifesting itself in two slightly different ways, or that they
have an important factor in common but one or more others peculiar to one
and not to the other. The factor analysis study carried out by the United
States Employment Service (735), referred to early in this chapter, showed
that the former hypothesis is correct, and that the Placing and Turning
Tests have practically identical factor composition; they are almost pure
tests of arm-and-hand dexterity.
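
The estimate of the "true" correlation given above is a judgment made by inspection. One conventional way of combining such coefficients, offered here only as a sketch and not as the method used in the text, is to average them through Fisher's z transformation, weighting each study by n - 3. The function name is hypothetical; the (r, n) pairs are those reported above for the Placing and Turning Tests, with Seashore's two samples left out because the text does not say which coefficient belongs to which group.

```python
import math


def pooled_correlation(studies):
    """Average correlations through Fisher's z, weighting each study by n - 3."""
    weighted_sum = 0.0
    total_weight = 0.0
    for r, n in studies:
        if n <= 3:
            raise ValueError("each study needs more than 3 cases")
        weight = n - 3
        weighted_sum += weight * math.atanh(r)  # r -> z
        total_weight += weight
    return math.tanh(weighted_sum / total_weight)  # mean z -> r


# (r, n) pairs as reported in the text for Placing vs. Turning.
placing_turning = [
    (0.55, 120),  # Blum, women packers and wrappers
    (0.27, 90),   # Jacobsen, aircraft industry trainees
    (0.52, 212),  # Jurgensen, paper-mill operatives
    (0.60, 345),  # Tuckman, women
    (0.66, 407),  # Tuckman, men
    (0.65, 171),  # Teegarden, women
    (0.73, 230),  # Teegarden, men
]
print(round(pooled_correlation(placing_turning), 2))
```

Pooling across sexes and occupations in this way ignores the differences between groups that the text points out, so the result is best read as a rough summary rather than as a population value.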

The Manual Dexterity Test has been correlated with tests of
intelligence by Tuckman (878), Jacobsen (396), and Super (unpublished
study). Tuckman administered the A.C.E. Psychological Examination to high
school students and adults, finding correlations of .18 and .17 for Placing,
and .29 and .26 for Turning. Job analysis of the tests suggests that the
closer relationship between Turning and intelligence may be due to the
slightly more complex manual task in that test, which requires bimanual
co-ordination of a rudimentary sort. But Jacobsen found correlations of
.16 and .12, using 90 adult subjects. Administering the Otis S.A. Test to
100 NYA youth, the writer obtained a correlation of .11 with the Placing
Test. In any case, the role of intelligence is negligible.

The similarity of fine-manual to gross-manual dexterity was ascertained
by Roberts (633), Jacobsen (396) and by Blum and Candee (105). Roberts
(633) found correlations of .46 and .40 between Placing and Turning, on
the one hand, and his Pennsylvania Bimanual Worksample (Assembly)
Test, a nut-and-bolt assembly task somewhat finer than the Minnesota
Test but grosser in its requirements than the O’Connor Tests (N = 473).
Jacobsen tested 90 wartime aircraft industry trainees, and found
correlations of .20 and .06 between O’Connor Finger Dexterity and Placing
and Turning Tests, of .26 and .20 between Tweezer Dexterity and the two
Minnesota subtests. Only the highest of these correlations was statistically
significant. Blum and Candee tested 130 women packers and wrappers
in a department store with the O’Connor Finger Dexterity Test, finding
correlations of .32 and .335 with Placing and Turning Tests. The
correlations were reliable. With only two studies available, one with
negative findings and one with positive, we are faced with a dilemma. But
poor testing conditions and other defects of procedure are more likely to
produce negative findings than positive, and the negative study was the work
of a beginner while the positive was that of two experienced investigators.
It therefore seems necessary tentatively to conclude that there is some
relationship between arm-and-hand dexterity on the one hand, and
wrist-and-finger dexterity on the other. That the relationship is not high
is indicated not only by these data, but also by the USES factor analysis
(735) which isolated two relatively independent manual factors: one
gross and one fine.
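
Whether a coefficient of a given size is "statistically significant" depends on the number of cases behind it. The following Python sketch shows one way of checking this, using the normal approximation based on Fisher's z. That choice is an assumption made here for simplicity, and the investigators cited may well have used other tests; the function name is likewise hypothetical. With 90 subjects, as in Jacobsen's study, a correlation of .26 falls below the 5 per cent level of chance expectation, while one of .20 does not quite do so.

```python
import math


def two_sided_p(r: float, n: int) -> float:
    """Approximate two-sided p-value for H0: rho = 0, via Fisher's z.

    z = atanh(r) * sqrt(n - 3) is treated as a standard normal deviate.
    """
    if n <= 3:
        raise ValueError("n must exceed 3")
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Coefficients of the size reported for 90 trainees.
for r in (0.26, 0.20):
    p = two_sided_p(r, 90)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"r = {r:.2f}, n = 90: {verdict} at the 5 per cent level")
```

The same reasoning explains why coefficients near .32, based on some 130 cases, could be called reliable while coefficients near .20, based on 90, could not.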

The role of arm-and-hand dexterity in tests of mechanical
comprehension was studied by Jacobsen (396) and Super (unpublished study).
The latter found a correlation of only .03 between Minnesota Mechanical
Assembly Test scores and the Placing Test, the subjects being 100 boys
and girls aged 16 to 24 employed on NYA projects. This is noteworthy
as the Assembly test involves the putting together of a variety of
mechanical objects such as a spark plug, a mechanical bottle-stopper, and an
old-fashioned lock. It is confirmed by Harrell’s (336) factor analysis of
the Minnesota Mechanical and other tests, which showed no manual dexterity
factor in the Minnesota Mechanical Assembly Test. Jacobsen found
correlations of .21 and .14 between Placing and Turning, on the one
hand, and the Bennett Mechanical Comprehension Test on the other.
The latter is a paper-and-pencil test measuring a somewhat higher order
of mechanical comprehension than the assembly test; that it has no
significant relationship to manual dexterity is therefore not surprising.

Tests of spatial visualization have been correlated with the Minnesota
Rate of Manipulation Test by Jacobsen (396), Teegarden (815) and Super
(unpublished study). The Minnesota Paper Form Board had correlations
of .06 and .00 with Placing and Turning in Jacobsen’s study, as compared
with one of .23 with the Placing Test in the writer’s investigation. The
writer found a correlation of .05 between the Minnesota Spatial Relations
Test and the Placing Test, a relationship which he has never seen
reported in the literature although it was to compute it that Ziegler
constructed the latter test. Jacobsen supplied the correlation between the
Crawford Spatial Relations Test and the Placing and Turning Tests: . ic)
and .11. As none of the above relationships were statistically significant
it is clear that manual dexterity and spatial visualization tests are
independent.

Ratings on success in training were used as a criterion in only one
published study with the Minnesota Rate of Manipulation Test. This was
Jacobsen’s investigation of the relationship between success in training
in aircraft mechanics and scores on various aptitude tests (396). These
war-industry trainees were rated by their instructors after the first two
weeks of training, and periodically each month thereafter for the two
or three months of training. Ratings were for seven traits such as learning,
speed and co-ordination, workmanship, and personal fitness for the
occupation, rated on a five-point scale. As the specific traits had
correlations with total fitness ratings which ranged from .89 to .97 the
latter only were used as a criterion; all fitness ratings for a given
individual were combined; apparently no attempt was made to ascertain the
reliability of the ratings, although the data would have permitted it.
Correlations ranged from -.05 to .17, none of them being reliable. Either
gross-manual dexterity as manifested in aircraft engine, aero repair,
machine shop, and other similar courses has no bearing on instructors’
evaluations of mechanical promise, even though they rated the subjects for
speed and co-ordination, or the Minnesota Rate of Manipulation Test does
not measure the type of manual dexterity sought by instructors. As will be
seen later, fine-manual dexterity as measured by the O’Connor tests had
unreliable and low correlations with these same ratings, the only tests
which gave reliable predictions of instructors’ ratings in these courses
being mechanical comprehension, arithmetic, and intelligence tests, in
that order.

Success on the job has been studied with electrical worksamples,
production in department-store packing and wrapping, supervisors’ ratings
of efficiency on these same jobs, ratings of efficiency in pharmaceutical
inspecting and packing, and ratings of success in ordnance factory and
paper-mill employees.

The electrical worksample developed by O’Rourke (connecting a
push-button, bell, and dry-cell) was used with p) boys and 37 girls aged
about 18, employed on NYA projects, by Steel, Balinsky, and Lang (751).
Tests were administered to one-half of the subjects before the projects
were initiated and to the others after the worksample. The worksample was
carried out individually in order to permit careful observation by the
examiners, who recorded care in the use of directions, facility in handling
tools and materials, initial adjustment to the task, and reaction to
difficulties. Brief interviews were held after the project in order to
elicit further reactions, but only the time score on the worksample was used
in the correlations. Neither of these was statistically significant for boys
(~ .oif and .10 for Placing and Turning), but both were for girls (.50 and
.35). Other data throw light on the reasons for these discrepancies. The
boys took significantly less time to complete the worksample than the
girls, although this was not true of the tests; the boys had had more
experience with electrical equipment and tools. Apparently it was amount
of experience with electrical equipment which determined the boys’ time
scores, rather than gross-manual dexterity or any of the abilities measured
by the other tests (fine-manual dexterity, spatial visualization, and
vocabulary), but unfortunately no test of electrical information was
used to provide a quantitative check on this explanation. The girls, on
the other hand, had had so little experience with such equipment that
vocabulary (ability to understand and follow the directions), spatial
visualization, and both types of manual dexterity determined the amount
of time they required to complete the task. Mechanical comprehension,
had it been tested, might also have played a part in the case of the girls,
since the other relevant aptitudes were important to their success. These
conclusions suggest that manual dexterity and other aptitude tests are
most likely to be valuable when selecting inexperienced workers for
semiskilled jobs, or for counseling inexperienced persons, whereas
careful evaluations of experience are likely to be more valuable with those
who have had relevant experience. This conclusion applies only to
initial job adjustments in semiskilled work, however, for that is what
the worksample tested; as Blum and Candee’s study of packers and
wrappers (106) showed, skills that are important in initial adjustment
sometimes play no part in long-term success, other aptitudes emerging as
the important ones after some experience has been acquired.

In their first study of department store packers and wrappers Blum and
Candee (105) tested 38 permanent employees of one department store,
together with employment-service applicants subsequently employed
by the store for whom criterion data became available. The criteria
used were production records and supervisors’ ratings. For the former,
the average daily number of packages wrapped during the month of
December, when employees work most nearly at their capacity, was
used; its split-half reliability was .88. The supervisors’ ratings were
those routinely made on a four-point scale, and consisted of an overall
efficiency rating phrased in terms of recommended continued employment
and seasonal rehiring. No data are presented concerning the
reliability of the ratings, which included none in the “inefficient” or
lowest category.

The correlations between production records and Placing and Turning
Tests were .35 and .27 for seasonal employees, and .21 and .of) for
permanent employees. Evidently arm-and-hand dexterity plays some
part in initial job adjustment in packing, but the skill requirements are
actually low enough so that experience erases the effect of differences in
aptitude. When supervisors’ ratings were used as a criterion no significant
differences were found between superior and inferior seasonal nor
between superior and inferior permanent employees, although the
permanent employees were rated superior to the seasonal employees and made
higher test scores. As the seasonal employees were considered especially
good that year, although not actually superior to the general population,
Blum and Candee concluded that experience must affect test
performance. While this is perhaps true, it cannot be considered as having
been proved, for there was no pre-employment testing of the experienced
group and no post-employment testing of the inexperienced group; the
higher scores of the experienced group may have been due to
self-selection, through the quitting of satisfactory experienced workers who
found that relative lack of manual dexterity required them to put
forth a disproportionate and unsatisfactory amount of effort in order to
keep up.

In their second study (106), Blum and Candee tested comparable
groups in another department store, and used similar criteria, but the
Turning Test was omitted. Again there was a moderate but significant
relationship between arm-and-hand dexterity and production in the case
of packers who handle large items (.“7b but not in the case of wrappers
who handle small items and make change. Again there was no
relationship between test scores and supervisors’ ratings of permanent
employees, but seasonal employees given the highest ratings tended to make
slightly higher test scores than those given lower ratings. The general
conclusion is the same as for the first study; arm-and-hand dexterity plays
a part in the initial job adjustment of packers, whose movements are gross
in nature, but practice minimizes its effects; in the case of wrappers,
whose work involves somewhat finer but still gross movements, arm-and-hand
dexterity as measured by the Minnesota test played no part. Neither
did finger dexterity as measured by O’Connor’s test; perhaps a new test
of an intermediate degree of fineness, involving wrist-and-finger
movements with objects the size of the Minnesota Rate of Manipulation Test’s
discs, would have produced positive results.

Another study of the same type was made by Ghiselli (288) with 42
seasonal wrappers who were rated for both quality and quantity of
output. The ratings were combined, and proved to have no significant
relationship to Placing and Turning Test scores (-.10 and -.02). Data
for finger dexterity were approximately the same. Ghiselli (286) also
worked with pharmaceutical inspector-packers, whose tasks consisted of
filling, stoppering, examining, labeling, cartoning, and packaging
containers of fluids, powders, and pastes. Job analysis suggested that
arm-and-hand dexterity and eye-hand co-ordination should be among the
important characteristics in performing the work. The Minnesota
Rate of Manipulation Test was therefore among those included in the
battery. Production being difficult to measure because of variations in the
nature of the work, a rating scale was devised to measure the traits
suggested by the job analysis, and the forewoman in charge of the work
rated each of the 26 girls. In addition, the supervisor of the finishing
room rated each on overall value to the organization. Reliability of the
ratings was checked by correlating the composite forewoman’s ratings
with the supervisor’s overall rating: the coefficient was .72. The two
ratings were therefore combined to serve as criterion. The correlations
between criterion and Placing and Turning Tests were -.24 and -.40
(negative because the scores are in terms of seconds used to perform the
task). Of the other factors measured spatial visualization was the most
important, more so than manual dexterity; clerical perception was
important, but less so than manual dexterity; and some of the spatial and
eye-hand co-ordination parts of the MacQuarrie were also valid. Ghiselli’s
preliminary job analysis therefore proved to be a sound one. It is
interesting that the specific factors in the Turning Test made it more valid
than the Placing, even though they measure the same basic factor: further
evidence of the desirability of using custom-built test batteries, and even
custom-built tests, for selection purposes.

It is also noteworthy that, although the manual operations in the
pharmaceutical job appear to have been more like those of the wrappers
than those of the packers in Blum and Candee’s studies, the dexterity
test had less validity for wrapper selection than for packer selection, and
less for packers than for pharmaceutical inspector-packers. In Blum and
Candee’s studies manual dexterity had some predictive value for initial
job adjustment, but no validity for experienced workers, while in
Ghiselli’s pharmaceutical study no distinction on the basis of experience
was made. Herein perhaps lies the explanation of the apparent
discrepancy: Ghiselli’s group is not described in terms of specific
experience, but the general statement is made that both the rate of turnover
in the department and company morale were high. If the group contained as
great a range of experience as these facts suggest, and if manual dexterity
is a selective factor on the job, then the range of manual dexterity was
probably greater in Ghiselli’s sample than in Blum and Candee’s (the
published data do not permit comparisons). A difference in aptitude
sampling such as this would result in a higher correlation coefficient for
Ghiselli’s study, even though the role of manual dexterity were really
identical in the two occupations. The final judgment would seem to be
that Blum and Candee’s conclusions are correct, but that the role of
manual dexterity is somewhat greater than they found it to be.
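
The argument that a wider range of aptitude raises the correlation coefficient can be made concrete with the familiar correction for restriction of range on the test variable (often called Thorndike's Case II). The Python sketch below is an illustration only: the published data, as noted, do not permit the comparison, so the coefficient of .25 and the ratios of standard deviations are hypothetical, as is the function name.

```python
import math


def correct_for_range_restriction(r_restricted: float, sd_ratio: float) -> float:
    """Estimate the correlation in a wider-ranging group (Thorndike Case II).

    sd_ratio is the test's standard deviation in the wider group divided by
    its standard deviation in the restricted group.
    """
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    r = r_restricted
    return r * sd_ratio / math.sqrt(1.0 - r * r + r * r * sd_ratio * sd_ratio)


# Hypothetical figures: the same underlying relationship seen in groups whose
# spread of dexterity scores is 1.0, 1.25, and 1.5 times that of a narrow group.
for ratio in (1.0, 1.25, 1.5):
    print(ratio, round(correct_for_range_restriction(0.25, ratio), 3))
```

On these assumed figures the coefficient rises steadily as the spread of scores widens, although the relationship between test and criterion is the same throughout, which is the point made in the text.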

A final study of the role of manual dexterity in packing jobs is that
carried out by the United States Employment Service and cited by Stead
and Shartle (750:217-227). They administered the Minnesota test to
can packers, 30 merchandise packers, and 41 inspector-wrappers (the
jobs are not further described). A production criterion was used for the
first two jobs: average number of cans packed per hour, and ratio of
time estimated as needed to complete a unit of work by time-and-motion
study men to time actually used to complete the unit; a rating was used
for the last-named job. The correlations for the Placing Test were .35,
.1 p and -.op; for the Turning Test, .22, .11, and .01 respectively. Only
for the can packers are the correlations high enough to be significant,
and the relationship is the opposite of that anticipated (true also for
finger dexterity): the slower or less dextrous tended to have the greater
output. For the inspector-packers, whom Ghiselli had considered most
likely to resemble his group, no relationship was found. The merchandise
packers closely resembled Blum and Candee’s department store packers
in operations performed: the correlations are lower in this study than
in theirs. Failing more detailed data on the USES study, reconciliation
of its findings with the others seems impossible; if enough facts were
available, good reasons for the discrepancy would no doubt be found.
Perhaps the USES study merely reversed signs. It therefore seems wise to
abide by the conclusions drawn from the studies which have been
reported in more detail. The USES study also included pull-socket
assemblers, put-in-coil girls, and cafeteria counter and floor girls, for
none of whom the test had validity (r’s = -.15 to .19).

A different type of occupational group was studied by McMurry and
Johnson (500), who administered the Minnesota dexterity test to 768
women being hired by an ordnance factory. Scores were validated against
ratings of 587 who remained long enough to be rated. The reliability of
the ratings was apparently not checked. Distribution among jobs is
illustrated by the fact that there were 97 welders, 440 assembly workers,
and 33 inspectors. No validities were reported, however, for the Rate of
Manipulation Test.

The paper-mill employees studied by Jurgensen (413) were men hired
as converting-machine operators, whose work consisted mostly of
removing a specified number of tissue-paper sheets from the machine, raising
the top sheet to insert advertising material, and placing the package of
sheets on a conveyor. All 60 were right-handed high school graduates
between the ages of 18 and 31. The criterion was a combination of three
supervisors’ ratings, the reliability of which was .75. Placing and
Turning Tests were both administered, plus some variations which included
placing and turning with both hands simultaneously. Validity coefficients
were: Placing .325, Turning .455, Right-Hand Placing-Turning .57,
Simultaneous Placing-Turning .33. These findings indicate not only that
the Minnesota test has predictive value for this type of semiskilled factory
work, but also that motion study can be valuable in suggesting variations
in the test which increase its validity for specific operations. It is
regrettable that Jurgensen did not also utilize an output criterion, the
greater objectivity of which (if not affected by slow-down, poor morale,
etc.) would provide a better index.

Occupational differentiation by means of the Minnesota Rate of
Manipulation Test has been checked by a number of investigators and for a
variety of jobs. Blum and Candee (106) found that satisfactory
experienced department store packers and wrappers made better scores than
the general population on the Placing Test, but the seasonal workers
who were considered an exceptionally good group did not. Ghiselli
(286) reported that pharmaceutical inspector-packers stood at the 96th
percentile on the Placing and at the 91st on the Turning Tests when
compared to the general population. In the USES study (750) pull-socket
assemblers, put-in-coil girls, and can packers exceeded the 7.7th percentile
of the general adult population in Placing: in Turning the merchandise
packers displaced the can packers. Merchandise packers, cafeteria counter
and floor girls, and inspector-wrappers were in the normal range in
Placing, with can packers taking the place of the merchandise packers
in Turning. Teegarden provided data for groups of from 26 to 123
semiskilled workers in a study previously cited. Of these occupational
groups only the assemblers, inspectors, and testers stood at about the
third quartile, with women packers and wrappers slightly below it, on
both Placing and Turning Tests; male packers and wrappers, helpers
in skilled trades, factory hand operatives, machine operators, and women
clerks were about one sigma above the adult mean in Placing, the
women machine operators and packers and wrappers being there also in
Turning; truck drivers and chauffeurs, truck loaders and helpers, male
sales clerks, restaurant workers, domestic workers, and manual laborers
all being in the normal range. It has already been noted that in the
MESRI studies butter and food packers and wrappers, bank tellers, and
male office clerks ranked higher than the third quartile on the Placing
Test, while women office clerks, minor bank officials, minor executives,
semiskilled workers, stenographers, typists, and garage mechanics, were
in the average range. From these findings it seems legitimate to conclude
that arm-and-hand dexterity as measured by the Minnesota test is
important in packing, wrapping and inspection jobs and in gross-manual
assembly and machine-operation jobs; the predictive value of the test
depends somewhat, however, upon the specific factors in the job and the
degree to which they also are tapped by the test. For this reason
sometimes the Placing Test, sometimes the Turning Test, and sometimes other
variations such as those tried by Jurgensen (413) with the Minnesota
materials and by others with custom-built pegboards, will have the most
predictive value and so be most helpful in selection or counseling.

Job satisfaction, in the case of the Minnesota Manual Dexterity Test
as in that of the Minnesota Clerical Test, has apparently not been a
subject of investigation.

Use of the Minnesota Rate of Manipulation Test. The Minnesota
Manual Dexterity or Rate of Manipulation Test has been found to be
useful primarily in connection with semiskilled occupations in which skill
in arm-and-hand movements seems, in job analyses, to be important. It has
not been found valuable in skilled trades, in which understanding of the
processes involved is more important than individual differences in the
manual dexterity with which they are executed. Even in the grosser
manual jobs such as packing and the assembly of large parts, differences
in skill which are found to exist before employment play a part primarily
in initial adjustments to the work rather than in long-term adjustments;
practice in the specific job operations appears to reduce the effect of
pre-employment differences to the zero point. It may be that these
differences play a part at this stage which current studies have not brought
out, by making the maintenance of adequate production so easy as to render
the work satisfying, or such a strain that it becomes subtly unbearable and
makes the worker quit or continue in a state of undiagnosed
dissatisfaction. In the light of present knowledge, however, this test seems
likely to be useful in counseling inexperienced persons concerning the
choice of packing and assembly jobs. It is even more likely to find use in
selection, when quick adjustment to routine work is desired, than in
counseling.

In selection programs local norms should be used, and in the initial 
studies of the test with a given job in a given plant variations in the 
technique of the test-task should be tried. The test then taps specific 
factors in the job along with the basic or group factor which should be 
its principal source of validity. The nature of these variations is
suggested by job analysis. The validity of the test is increased by this
method, for the initial job-adjustment period; its long-term adjustment
validity may be decreased by the emphasis on developed rather than on 
latent skills, but at this stage of our knowledge that is only a subject for 
speculation and investigation. 

It is doubtful whether this type of test has any place as a directional 
instrument in a school counseling program. If experience erases the 
effects of normal individual differences in this type of dexterity, then 
it is the function of education to provide such experience in appropriate 
cases (those of persons who may enter such work, as suggested by
intelligence, interest, and socio-economic status). The test will not be useful
in providing data for the making of decisions concerning the choice of
semiskilled occupations. It may, on the other hand, give some insight 
into the assets and liabilities with which a student enters upon new 
experiences. 

In employment counseling, whether at the end of an educational
program, in an adult guidance center, or in an employment service, the test
should be of more value, for the question of initial job adjustments of 
workers inexperienced in jobs requiring arm-and-hand dexterity is both 
more common and one on which the Minnesota test throws some light. 
Since the occupational norms are based on small groups they must be 
selected with a full understanding of the particular sample and employed 
tentatively, but some facts cautiously used are better than none at all 
when decisions have to be made. When the University of Minnesota’s 
form is used, the MESRI norms are still the best; when the Educational
Test Bureau’s form is used, then Teegarden’s data will probably be
found most helpful. In either case the norms should be thought of as
merely suggestive.

Finally, it is pertinent to ask whether both Placing and Turning Tests
should be used, or only one of them; and if the latter, which one. In a
specialized battery for semiskilled jobs both should be used, because of
the higher correlation of the Placing Test with some gross-movement
jobs such as department-store packing, and of the superior validity of
the Turning Test for some finer-movement jobs such as packaging drugs.
In more comprehensive test batteries, in which there is not sufficient
testing time for the refined investigation of each area, the fact that the
factor composition of the two tests is identical means that one of the
tests should be sufficient for survey or screening purposes. In such
situations the test to be used should depend upon which is likely to be more
closely related to jobs the examinee may consider or be considered for;
in the absence of data for the making of such judgments, the Placing Test
can probably best be used, together with a wrist-and-finger dexterity test
to tap the other extreme of fineness.

The O’Connor Finger and Tweezer Dexterity Tests (C. H. Stoelting and
Co., 1928)

The O’Connor Finger and Tweezer Dexterity Tests were developed in
the middle 1920’s while O’Connor was employed in the West Lynn works
of the General Electric Company (374). He was concerned with the
selection of women for electric-meter and instrument assembly work,
and devised these tests for that purpose. Similar tests had previously been
described by Whitman (923), who used them with children. They have
since been tried out on various types of workers, particularly in the
Minnesota Employment Stabilization Research Institute.

Applicability. The tests were designed for use with adults and with
older adolescents of post-high school age; they were standardized on such
groups, and restandardized on similar groups by the MESRI project.
They are widely administered to adolescents, but this writer has seen
no studies of their applicability to these younger groups. The fact that
physical maturity comes somewhat earlier than mental has seemed to
warrant the use of this dexterity test from age 13 or 14 on (94), but it
has not actually been demonstrated that this specific type of dexterity
matures early. We have seen that the assumption of early maturation
proved misleading in the case of the Minnesota clerical and manual
dexterity tests: it may be equally misleading in the case of this
instrument. In the absence of data on this question, one should proceed
cautiously with the use of the O’Connor tests with high school boys and
girls: but there is probably not much danger of being misled as a result
of age-changes after the last two years of high school.

Candee and Blum (133) have reported that age in adulthood and work
experience have no effect on the scores of the O’Connor tests.

Content. The Finger Dexterity Test consists of a shallow tray beside
a metal plate in which there are 100 holes arranged in ten rows of ten
holes each (the only readily available form, Stoelting’s, is made of
different materials). Each hole is large enough to hold three metal pins one
inch long and .07 inches in diameter: the holes are spaced one-half inch
apart. The Tweezer Dexterity Test is sometimes the opposite side of the
boards used for the Finger Dexterity Test; again the metal plate has 100
holes in it, but these are only slightly larger than the pins, allowing one
to be placed in each hole. A pair of 00 gauge tweezers are used in this
test to pick up the pins.

Administration and Scoring. The Dexterity Test is administered with
the subject seated at a table of standard height (30 inches), with the
pinboard about a foot from the edge of the table, the tray on the side of the
favored hand, placed at an angle of about ejo degrees to the subject. The
directions are clear except for one point. The O’Connor tests are
incorrectly given by many psychometrists because they do not read the
instructions carefully enough to realize that if a right-handed subject were
to start in the top-left corner and fill the holes toward himself he would
fill columns, which go vertically, up-and-down, rather than rows. The
examinee should actually begin at the far corner (top-left for a
right-handed person) and fill the holes of the top row across to the other
(top-right for a right-handed person) corner, then begin to fill the holes
in the second row in the same manner as the first, then the holes of the
third row, etc.

In the Finger Dexterity Test the subject picks up three pins with his
preferred hand and places them in each hole; in the Tweezer Dexterity
Test he picks up one pin at a time and places it in its hole. The score is
normally on the basis of time, with a small correction of the second half
for practice on the first half; some recent studies, including those of the
USES, use simply the total time required, which is probably sufficiently
refined for practical purposes. The time required varies from 8 to 17
minutes for the Finger Dexterity Test, and up to about 10 minutes for
the Tweezer Test. Accurate timing is important and requires either a
stop watch or a watch with a sweep-second hand.

Norms. Although O’Connor presented adult norms in his original
report of his work (371), the most representative and generally used
norms are those of the Minnesota Employment Stabilization Research
Institute (306). These are for the standard sample of 500 employed
adults, supplemented by averages for small groups of persons in a variety
of occupations, most of which are unfortunately not the types for which
these tests can be expected to be useful. Means and sigmas are available
for more pertinent occupations in other studies (103, 750) discussed
below, but unfortunately the scores in these are given in terms of total
number of seconds used rather than in terms of O’Connor’s correction.
Perhaps in due course the work of the USES will make it advisable to
use the total time score and their norms; in the meantime, the corrected
score and the MESRI norms are best. As no published manual is
available, the Minnesota norms are reproduced in Table 14.

Means for occupational groups which might be expected to make high
scores on these tests, together with those of certain others included for
sake of contrast, are given in Table 15 and can serve as a suggestive
guide in the use of O’Connor test results.

It should be noted that women tend to do better than men on this 
as on other types of dexterity tests, and that the only occupations for
which these tests have been shown to have any clear-cut value are women
instrument assemblers, bank tellers, office workers, manual-training
teachers, and draftsmen: as will be brought out below, these data help
one to understand the test, but they are hardly enough to serve as norms.

Table 15

AVERAGE FINGER AND TWEEZER DEXTERITY SCORES

* Data not available


Standardization and Initial Validation. O’Connor standardized the
test on 2000 women applicants for factory employment and an equal
number of men in the General Electric plant at West Lynn,
Massachusetts. The Finger Dexterity Test was administered a number of times to
the same workers, with the finding that the second trial was somewhat
better than the first, and the fifth trial showed little further improvement
over the fourth. Retest reliability for the first and second trials was .60
on the Finger Dexterity Test, considerably lower than those obtained by
others. The original validation of this test was on a group of women
applicants who were tested when interviewed for employment and hired
for assembly work. Hines and O’Connor (374) reported that 36 percent
of those in the lowest quarter left the company before S months had
elapsed, as compared with only 6 percent of those in the top quarter;
this seems impressive until it is realized that 36 percent of one-quarter
of 36 (the total number of cases) is slightly more than one-third of 9,
that is, three-and-a-fraction persons, and that 6 percent of one-quarter
of 36 is approximately 1/17 of 9, or a little more than one-half of one
person. Just how three-and-a-fraction persons, and a little more than
one-half a person, can fail is something that even some eminent
test-construction specialists have failed to ask (590:237). The report does not
exactly strengthen one’s confidence in the original work with the test,
however ingenious the test idea. No data were published by O’Connor
dealing in specific detail with the Tweezer Dexterity Test.
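
The arithmetic behind this criticism can be checked directly. The sketch
below takes the figures as they appear to be printed above (36 cases in
all, hence 9 in each quarter, with turnover of 36 and 6 percent) and shows
that neither percentage corresponds to a whole number of workers. The
function name is illustrative only and is not taken from any of the
studies cited.

```python
def persons_implied(percent, total_cases, fraction_of_group=0.25):
    """Return the number of persons a percentage implies within a subgroup."""
    subgroup_size = total_cases * fraction_of_group  # one quarter of 36 is 9
    return subgroup_size * percent / 100.0

# Figures as read from the text: 36 cases, 36 and 6 percent turnover.
low_quarter = persons_implied(36, 36)
top_quarter = persons_implied(6, 36)
print(round(low_quarter, 2), round(top_quarter, 2))
```

Neither result is a whole number, which is the point of the objection:
the reported percentages cannot be exact counts of workers in groups of 9.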

Reliability. As we have seen, Hines and O’Connor (374) originally
reported the retest reliability of the Finger Dexterity Test as .60. Blum
(103) retested 64 employment applicants, obtaining a much higher
coefficient of .89; he also reported an uncorrected split-half reliability
of .77. Split-half reliabilities for the same test have been reported by
Darley (187): these are (corrected) .93 and .90 for samples of 475 men
and 215 women. Apparently the test is reliable even according to the
retest method. No reliability data have been located for the Tweezer
Dexterity Test, the above investigators having made the seemingly
warranted assumption that the two tests cannot differ much in this respect.
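
The difference between an uncorrected and a corrected split-half
coefficient can be illustrated with the Spearman-Brown formula, which is
the conventional correction for test length. The text does not say which
correction Darley applied, so its use here is an assumption; the sketch
simply shows what the formula would yield if applied to Blum's
uncorrected figure of .77.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Estimate the reliability of a test `length_factor` times as long
    as the part whose correlation is r_half (Spearman-Brown formula)."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)

# Blum's uncorrected split-half coefficient, stepped up to full length.
corrected = spearman_brown(0.77)
print(round(corrected, 2))
```

On this assumption the corrected value comes to about .87, close to
Blum's retest coefficient of .89.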

Validity. Because of their early publication the O’Connor dexterity 
tests have been used in a number of studies. Even though many of these 
had only an indirect or very partial interest in the nature and validity
of the test, they do, taken as a whole, throw considerable light on its
validity.

Correlations with other tests have been computed for the usual variety
of measures. The Finger and Tweezer Dexterity Tests have been found
to have intercorrelations of .17 by Jacobsen (396) with 90 war-industry
trainees as subjects, .19 by Blum (103) who tested 119 women
factory-employment applicants, .47 by Thompson (824) with 35 dental freshmen,
.33 by the Minnesota project (187) with a heterogeneous group of
women and .56 for a similar group of men, and .57 by Harris (341) with
a group of 60 dental students 59 of whom completed the four year
course. As Blum’s and Jacobsen’s results are based on factory workers,
the others’ on professional or mixed groups, it is probably safe to
conclude that the correlation is approximately .50 in heterogeneous groups
and less than .20 in homogeneous groups.

Correlations between O’Connor dexterity tests and intelligence tests
have been reported by Harris (341) for dental students, using the Otis
S.A. Test. The coefficients are -.01 and .or5.

The relationship with arm-and-hand dexterity is perhaps of most
interest. Finger Dexterity was found to correlate to the extent of .21
and .42 with the Minnesota Placing Test by Jacobsen (396) and by
Blum and Candee (105), .06 and .335 with the Turning Test. For
Tweezer Dexterity correlations of .26 and .20 with Placing and Turning
Tests were reported by Jacobsen (396). With one exception, Jacobsen’s
correlations are not high enough to be significant, while Blum and
Candee’s are. As was brought out in the discussion of arm-and-hand
dexterity, it seems likely that the latter’s results should be accepted until
more conclusive studies are made. The two types of dexterity should be
thought of as related but distinct aptitudes.

Correlations with tests of spatial visualization are more numerous.
Andrew (21) reported correlations of .28 and .31 between the Minnesota
Spatial Relations Test and the Finger and Tweezer Dexterity Tests,
based on 200 women clerical workers. For the Revised Minnesota Paper
Form Board Jacobsen (396) and Thompson (824) reported correlations
of less than .15 with war-industry trainees and dental students. Jacobsen
also found no significant relationships with the Crawford Spatial
Relations Test (.22 and .it); in a more heterogeneous group they might be
higher. Harris (341) reported correlations of -.02 and .15 with the
Wiggly Block. Evidently ability to visualize space relations plays no part
in wrist-and-finger dexterity.

Wrist-and-finger dexterity has rarely been correlated with mechanical
comprehension, probably because of the anticipated low relationship.
Jacobsen (396) confirmed expectations with coefficients of -.08 and .14
with the Bennett Mechanical Comprehension Test.

Success in training has been investigated for an electrical worksample,
aircraft mechanics, power-sewing-machine operation, machine-tool
operation, fine arts, and dentistry.

The study of the electrical worksample (751) has already been
described in connection with the Minnesota Rate of Manipulation Test. In
it the Finger Dexterity Test had validities of .08 for boys and 3-, for
girls, while those for the Tweezer Dexterity Test were .18 and .42 (other
data show that the signs should be negative, being for time scores). As
has already been seen, there is reason for believing that the boys’
worksample scores were a reflection of degrees of experience while the girls’
were the result of differences in aptitudes, and that in such work
wrist-and-finger dexterity is likely to play a part only in the initial
adjustments of novices or in output differences in equally experienced workers.

The investigation of factors in success in aircraft mechanic training
was also discussed in the section on the Minnesota Rate of Manipulation
Test. Jacobsen (396) found only two significant relationships between
O’Connor dexterity tests and instructors’ ratings of fitness for the
occupation: these were between Finger Dexterity and ratings in aircraft
electricity (.31) and Tweezer Dexterity and ratings in aircraft instruments
(.32). As the other eight coefficients ranged from -.02 to .22, and there
is no apparent logic underlying the different results, the writer is
inclined to consider the two statistically significant correlations the
products of chance. In any set of correlation coefficients some will appear
significant simply as a result of chance factors. Such a conclusion is
forced also by the illogic of those who take most time to complete a
speed test being the best students (unless Jacobsen reversed signs).
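
How readily chance alone produces a few "significant" coefficients in a
set of ten can be sketched with the binomial distribution. Two
assumptions are made here that the text does not state: that the ten
coefficients are independent (they are not strictly so, being based on
the same trainees), and that significance was judged at the 5 percent
level. The function name is illustrative.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """Probability that k or more of n independent tests reach
    significance at level alpha when no true relationship exists."""
    return sum(comb(n, i) * alpha**i * (1 - alpha)**(n - i)
               for i in range(k, n + 1))

# Ten coefficients, assumed independent, assumed 5 percent level.
p_one = prob_at_least(1, 10, 0.05)
p_two = prob_at_least(2, 10, 0.05)
print(round(p_one, 3), round(p_two, 3))
```

Under these assumptions at least one spuriously significant coefficient
is expected in roughly four sets out of ten, and two or more in somewhat
under one set in ten, so that an appeal to chance is not unreasonable.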

High school girls learning power-sewing-machine operation were 
studied by Otis (579), who used time taken to complete a series of
worksamples and quality-ratings of the same tasks as criteria. The two
criteria had an intercorrelation of -.17 ± .13, from which it might be
concluded that there was no relationship between speed and quality,
but which may, in the absence of reliability data, only prove that one
or both of the criteria were unreliable. The speed criterion had a
correlation of .27 with Finger and .]f> with Tweezer Dexterity; the quality
criterion, .20 and .07, neither of these latter being statistically significant. 
These results suggest that at least the speed criterion was reliable, and 
show that those who were fastest on the test tended to be most rapid 
on the task. 

In the study of machine-tool trainees Ross (651) administered the
O’Connor Finger Dexterity, Minnesota Spatial Relations, and O’Rourke
Mechanical Aptitude Tests and related them to grades in training,
establishing critical scores but not obtaining any indices of degrees of
relationship.

Students of fine arts were tested with the O’Connor dexterity tests and
the Minnesota Paper Form Board by Thompson (824), the number of
students being ;,o. Correlations with point-hour ratios were .21 for
Finger and .oS for Tweezer Dexterity, neither of which was clearly
significant. This finding can perhaps be discounted, however, since
grades in fine arts are probably not the most appropriate of criteria; a
study using ratings of the quality of artistic craftsmanship, made by
experts and checked for reliability, might yield quite different results.

Students of dentistry have been studied by Douglass and McCullough
(208), Harris (341), Jones (unpublished study), and Thompson (824).
In the first-named study a variety of tests were tried at the University
of Minnesota over a period of several years, with average grades in
dental school the criterion. The results varied somewhat from one sample
to another, but in a typical group of 83 students the correlations between
grades and Finger and Tweezer Tests were -po and -.30. In Harris’
preliminary study of 30 dental freshmen at Tufts first-year rank was
the criterion, and the correlations with Finger and Tweezer Dexterity
Tests were -.395 and -.36. These are the only studies reporting validity
for grades in dentistry; it seems rather surprising that so specific an
aptitude should show a substantial relationship to such a multi-factorial
criterion as college grades, especially as the first two years of dental
training are more academic than manual or practical. And Harris’ more
definitive study, in the same school and with the same tests, based on
66 students with both first- and four-year grades as criterion, yielded
validities of only -.10 and -.17 for first-year grades and .15 and -.10
for four-year grades (for the numbers in question, the coefficients would
need to equal .31 to be significant at the 1 percent level). The Otis S.A.
Test, on the other hand, had validities of .33 and .33. Thompson also
correlated O’Connor’s tests with freshman and four-year grades, for one
group of 33 freshmen and another of ~jo seniors in dentistry, finding
validities of .01 and .01 for Finger and -.07 and .13 for Tweezer
Dexterity. E. S. Jones, in conversation with the writer, has also reported
obtaining negligible correlations between O’Connor tests and dental
grades at the University of Buffalo. The evidence is now very strongly in
favor, therefore, of a lack of predictive value in the O’Connor tests for
grades in dental school, the statements of O’Connor (371) and the
guarded suggestions of Bingham (91:28} and uSb) to the contrary
notwithstanding. However, their logic seems so good that this writer too
would not be surprised to see substantial validities for these tests when
correlated with a practical criterion, e.g., reliable ratings, such as might
be made by patients, of skill in clinical work. Douglass and McCullough’s
(208) correlations with laboratory grades, -.43 and -.33, are promising.
Other studies of this type, and consistent validities, have as yet not been
reported; it will be seen that other evidence of the tests’ validity for
dental training exists.
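
The size of coefficient needed for significance at the 1 percent level
can be approximated as follows. The sketch assumes a sample of 66
students (as the figure for Harris' later study appears to read), a
two-tailed test, and Fisher's z transformation with a normal
approximation in place of exact tables; the function name is
illustrative.

```python
from math import sqrt, tanh
from statistics import NormalDist

def critical_r(n, alpha=0.01):
    """Approximate two-tailed critical value of a correlation
    coefficient for sample size n, using Fisher's z transformation."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # normal deviate
    return tanh(z_crit / sqrt(n - 3))                 # back to the r scale

# Assumed sample size of 66, assumed two-tailed 1 percent level.
r_needed = critical_r(66)
print(round(r_needed, 2))
```

With these assumptions the required coefficient is about .31, in
agreement with the figure given in the text; smaller samples such as
Thompson's would require still larger coefficients.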

Success on the job has been the subject of investigation with watch
assemblers, electrical fixtures and radio assemblers, department store
packers and wrappers, pull-socket assemblers, put-in-coil girls, and can
packers.

In a preliminary study of watch assemblers Candee and Blum (133)
administered the Finger and Tweezer Dexterity Tests to 20 women
workers selected as superior and 17 selected as mediocre by their
foremen. The difference between scores of the two groups on the Finger
Dexterity Test approached significance (D/σd = 3.18); no such difference
was found for the Tweezer Dexterity Test (D/σd = j.oi), but this latter
test differentiated employees from a group of applicants better than did
the former. Critical scores of 7 'jjo" and 5'30" were established for Finger
and Tweezer Dexterity Tests. Two years later these workers were followed
up by Blum (ioyp. None of those who had been rated superior had been
discharged, as contrasted with 18 percent of the mediocre workers: the
critical ratio was 2.00. The salary ratios (average weekly piece-rate
earnings over a three-month period divided by the average for all employees
and expressed as an index with $20 per week equal to 100) of the two
groups were 110 and 93, which gave a critical ratio of 3.7; apparently
the foremen’s judgment of superiority was generally good. Although the
groups were so small as to make conclusions necessarily tentative, the
trend was clearly for superior workers to make better scores on the two
tests.
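
A critical ratio of the kind reported for the discharge figures can be
sketched as the difference between two proportions divided by the
standard error of that difference. The formula used in the original
study is not given, and the numbers remaining in each group at the
follow-up are not stated, so the group sizes of 20 and 17 from the
preliminary study and the unpooled standard error used below are both
assumptions.

```python
from math import sqrt

def critical_ratio(p1, n1, p2, n2):
    """Difference between two proportions divided by the (unpooled)
    standard error of the difference."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return abs(p1 - p2) / se

# 0 percent of 20 superior vs. 18 percent of 17 mediocre workers
# discharged; the group sizes are assumed, not taken from the follow-up.
cr = critical_ratio(0.0, 20, 0.18, 17)
print(round(cr, 2))
```

The result falls a little below the reported 2.00, which is as close as
can be expected when the exact counts and formula are unknown.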

In a subsequent study Blum (103) used length of employment,
foremen’s ratings, and salary ratio as the criterion. The salary ratio was that
described above; length of employment was divided into “less than one
week” (failure group), one week to four months (unsatisfactory group),
four months to one year (a moderately proficient group), and more than
one year (permanent and effective employees); foremen’s ratings were
on a five-point scale ranging from “excellent” to “terrible.” The first two
criteria are objective and hence reliable: the last had a reliability
coefficient of .(>0 for pj workers re-rated after the lapse of more than one year,
which is quite high in view of changes in the worker during such a
period. The subjects were women applicants for factory work at a branch
of the New York State Employment Service; 137 constituted the tested
group, and another 84 who also were selected solely on the basis of an
interview but were not tested were used as a control group. Most of the
group had had industrial experience but none had worked in watch
factories; all were white, and 90 percent were between 20 and 25 years
of age, with a range from 18 to 40. The factory at no time had
knowledge of the women’s test scores. The Finger and Tweezer Tests were
administered before hiring; scores obtained were time in seconds, quality
ratings (reliability for Finger Dexterity equaled .89), and absolute and
relative improvement (reliabilities of . rg and .26). It is worth noting that
time and quality scores had intercorrelations of .14 for Finger and .71
for Tweezer Dexterity Tests, and that the two quality ratings had an
intercorrelation of .26; in view of the reliability of the Finger Dexterity
quality ratings and a restricted range of Tweezer quality ratings this is
difficult to explain.

Quality ratings yielded no significant relationships with length of 
employment or salary ratio, with the exception of Tweezer Dexterity and 
the former; whereas 64 percent of those who received above-average 
quality ratings worked for four months or longer, only 39 percent of 
those rated below average on quality of Tweezer Dexterity remained on 
the job that long (D/σd = 3.6). On the other hand, both Finger and
Tweezer dexterity quality ratings yielded reliable contingency coefficients 
with foremen’s ratings (.50 and .24, with .84 the maximum possible). 

Time scores on both tests showed significant differences between
less-than-seven-day employees and those who remained on the job for more
than a year (D/σd = 4.3 and 2.5), with differences approaching
significance when the former group were compared with the
four-month-to-a-year group. Correlations between the two tests and salary ratio were .26
and .32 (other data show that the signs should be negative, being time
scores); when the two tests were combined, the validity was .39. All
three have some statistical significance. The relationships with foremen’s
ratings were not reliable.

A further step consisted of applying the previously established critical
scores (133) to this new group of workers and to the 84 controls who were
not tested. There was again no relationship with foremen’s ratings. Of
the group who “passed” both tests when the critical score was applied,
only 7 percent were discharged in less than one week, while 37 percent
were employed for more than a year; for the no-test group the percentages
were 23 and 41; for the group who “failed” one or both tests they were
24 and 28. Appropriate critical ratios were clearly significant. Salary ratios
were 91, 88, and 73 for the three categories just utilized, with the
differences again significant.

Finger and Tweezer Dexterity Tests are clearly useful in selecting
successful watch assembly workers when criteria such as turnover and 
output (salary ratio) are used. 

Electrical assembly workers and one type of packer were tested by
the USES Division of Occupational Analysis (750): pull-socket
assemblers, put-in-coil girls, and can packers. The groups were 16, 18, and 43
in number, presumably all women although sex is not specified for two
groups. The criteria were number of pull sockets assembled per hour,
ratio of time consumed to complete a unit of work to standard time set
by time and motion study men for put-in-coil girls, and average number
of cans packed per hour. Only the Finger Dexterity Test was administered
to all three groups, with validities of -.09, -.25, and .26. Put-in-coil
girls also took the Tweezer Dexterity Test, the validity being -.57. It is
interesting to note that the can packers, for whom the correlations of
time scores with the Minnesota Manual Dexterity Test and the USES
Pegboard were negative, have a positive correlation with time scores on
the tests of wrist-and-finger dexterity. This suggests that some types of
assembly work tend to retain workers who are fast in gross movements
but slow in fine, whereas others retain workers who are dextrous in both
types of operations: presumably the latter would tend to pay more and
to be more selective. But the finding may be a reflection of a less selective
employment policy rather than of less stringent work requirements, for
the numbers are small and may have been employed in only one
company, and the spread of scores is much greater for the can packers than
for the other assembly workers (sigma equals approximately 30 as
opposed to 1S seconds). For one of these assembly jobs the O’Connor
dexterity tests do clearly have predictive value, apparently that
requiring the finest wrist-and-finger movements: for another, somewhat grosser,
assembly job it seems to have none (neither did the Minnesota Manual
Dexterity Test): for the third and grossest manual job it has low validity
of a negative sort, slow test workers tending to be fast task workers. The
last two relationships may be the result of the operation of chance
factors, but the first is too consistent to be the result of chance.

Blum and Candee used the O’Connor Dexterity Tests in their studies
(105, 106) of department store packers and wrappers, described under
the Minnesota test, finding zero relationships between these tests and
output or supervisors’ ratings.

Occupational differentiation on the basis of wrist-and-finger dexterity
is brought out by MESRI data presented earlier in Table 15. Office
workers, particularly those using machines, tend to make scores
approximately one sigma above the average employed adult. Men who must use
their hands skillfully in certain crafts and professions (manual-training
teaching, drafting, and ornamental iron work) stand approximately as
high. Women who assemble small objects (electric meters and instruments)
also excel. On the other hand, skilled workers to whom technical
information and understanding are more important than manual precision
(garage mechanics), and assembly workers and operatives whose
operations are gross in nature, score no better than the average worker.

It would be highly desirable to compare the means of the USES,
Blum, and other more recent and more relevant occupational groups with
these, but unfortunately this is made impossible by differences in the
scoring methods or by doubt concerning the scoring methods, combined
with mean scores which seem quite out of line with MESRI norms (e.g.,
Blum’s mean Finger Dexterity time of 417 seconds for successful women
watch assemblers compared to the mean of 244 seconds for MESRI adult
women). Only one comparison seems clearly legitimate, that between
Harris’ dental students (341) and the MESRI norms. This shows that
the former group stood at the 8jth percentile on Finger Dexterity and
at the 8yth on Tweezer Dexterity, higher than any of the occupational
groups for which norms were obtained by the Minnesota project. Such
a vindication of the clinical judgment of many users of and writers about
a test is, unfortunately, all too rare.
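
Statements such as "one sigma above the average" and percentile standings
can be related to one another if the scores are assumed to be normally
distributed. The following is only a sketch under that assumption, which
the MESRI data may or may not satisfy; the function name is illustrative.

```python
from statistics import NormalDist


def percentile_from_sigma(z: float) -> float:
    """Percentile rank of a score lying z standard deviations from the mean.

    Assumes a normal distribution of scores (an assumption, not a finding).
    """
    return 100.0 * NormalDist().cdf(z)


# A group averaging one sigma above the general adult mean would stand
# at roughly the 84th percentile of that general group.
for z in (0.0, 1.0):
    print(f"{z:+.1f} sigma -> percentile {percentile_from_sigma(z):.0f}")
```

On this reading, office machine operators scoring about one sigma above
the average employed adult would stand near the 84th percentile.
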

Job satisfaction has not been related to the O’Connor dexterity tests
in any published studies. Presumably the tendency is to focus on the
worker’s need to make a living and on the employer’s desire for
efficiency, rather than on the mutual need for emotionally adjusted
citizens who find satisfaction in their work.

Use of the O’Connor Finger and Tweezer Dexterity Tests in Counseling
and Selection. The studies which have been made with the O’Connor
dexterity tests have, like the original investigation in which they
were used, been concerned almost exclusively with their use in the
selection of vocational or professional students and employees. While
data from such sources are not only valuable but essential for tests which
are to be used in vocational and educational counseling, they are not
sufficient. We have repeatedly seen that one must also have information
concerning the development and maturation of the aptitude or trait in
question, in order to be able to apply the test to adolescents, and that
information must be available which in other ways throws light on the
nature of the characteristic being measured. For the O’Connor dexterity
tests fairly adequate data are available to help understand the nature of
the trait: it is distinct from others which we are able to measure, and it
plays a part in certain types of vocational activities (summarized below).
But little is known specifically about its development and maturation,
apart from the fact that such aptitudes generally mature earlier than
intellectual traits. This means that caution is necessary in interpreting
the test scores of adolescents, although those of 17 and 18-year-olds can
probably be used with some assurance of stability.

In general, experience with the tests suggests that wrist-and-finger
dexterity is likely to be important during the period of initial
adjustment to fine manual jobs, and that it is likely to be related to
success on the job when people with approximately equal amounts of
technical understanding or trade knowledge are being compared. When the
latter vary considerably among applicants or employees, differences in them
are likely to outweigh the importance of differences in finger dexterity.

Commenting on the earlier studies of tests of manual dexterity,
Wittenborn (cjgy) has pointed out that the common failure of such tests to
prove valid probably lies in the nature of the criteria that have been
employed. He states:

“Most of the criteria which have been employed in the prediction of
mechanical ability have been work samples prepared under unusual
competition and other atypical conditions which appear to call for a
much higher order of spatial visualizing judgment than manipulative
ability, e.g., the criteria used in the Minnesota study (of mechanical
abilities). The so-called motor aspects of mechanical ability cannot be
assumed to be of limited significance simply because their significance
has not been rigorously demonstrated by suitable studies. If investigators
employed such criteria as satisfaction in work, duration of employment
in routine operations, speed of work, quality of specific operations, piece
work output, breakage, fatigability and other factors . . . it might well
be demonstrated that the motor abilities, particularly manipulative
ability could . . . be granted a significant role in guidance and
selection procedures.”

Although his paper was written after the publication of most of the
studies reviewed in this chapter, Wittenborn apparently based his
remarks almost entirely on the Minnesota mechanical abilities study, for
while the gist of his remarks is true, some studies have been made which
conform to his suggestions. We have seen that some craftsmen whose
work requires manual precision and probably some interest in using
one’s hands excel in fine manual dexterity (ornamental iron workers,
manual-training teachers, draftsmen, and dentists) while others whose
work requires trade knowledge and insight but no special manual skill
(garage mechanics) do not. We have seen that watch assembly workers
who stand high in fine manual dexterity tend to keep their jobs longer
and to produce more than do those who make lower scores on the
O’Connor tests. We have seen that those whose fine manual skills impress
a psychometrist as above average tend to be rated as better workers by
their foremen. But such skills are not important in gross manual work
such as packing. Wittenborn’s insights were excellent, although the state
of research was not as lamentable as he thought it.

Although Wittenborn was correct in claiming that (fine) manual
dexterity is important in some mechanical occupations, its primary
importance lies in certain types of semiskilled jobs. The principal reason
for the apparent uselessness of tests of manual dexterity in guidance and
selection lay, not so much in the criteria taken by themselves, as in the
types of jobs which were first studied by means of manual dexterity tests,
e.g., those in the MESRI norm group. Other studies discussed in this
section have shown that fine manual dexterity is important in simple
manual jobs which require rapid wrist-and-finger movements, e.g.,
power-sewing-machine operation and the assembly of small electrical parts;
in more complex assembly work requiring both speed and precision, e.g.,
watch assembly; and in other occupations in which rapid manipulation
of small objects such as office machines, cash, and the like is involved,
e.g., office machine operator, bank teller, and typist.

The O’Connor dexterity tests can therefore make a contribution to
diagnostic and prognostic work in high schools and colleges, at least for
students in their late teens and above. In such work they are helpful with
students who are considering entering or preparing for professional,
mechanical, or office work in which skill with the hands is important,
and with others who may enter types of semiskilled factory work in
which speed or precision of wrist-and-finger movements is related to
stability of employment, earnings, and, probably, satisfaction.

In guidance centers the tests are useful for the same purposes, and have
additional value in employment counseling when initial adjustments
are likely to be important: Steel and others demonstrated this with their
electrical worksample, and Blum with the watch assemblers who remained
on the job for less than a week.

In business and industry, the Finger and Tweezer Dexterity Tests are
most useful in the selection of persons who will adapt themselves most
readily to speedy or precise semiskilled work. They have little to
contribute to the selection of skilled, clerical, and professional workers,
as those who have completed appropriate training and chosen to continue in
the field are likely to be above the critical minimum needed in such
occupations.

Since there are two O’Connor wrist-and-finger dexterity tests it is in
order to ask, finally, whether both should be used or only one will do,
and if so, which one. In heterogeneous groups, and when rough screening
is the objective, one of the tests suffices because of the substantial
correlation in such groups. Normally the Finger Dexterity Test is to be
recommended as a measure of a more commonly used degree of dexterity, but
in some situations the Tweezer Test will be more appropriate. The Finger
Test also has the advantage of having been more thoroughly studied.
In homogeneous groups, and when more refined judgments need to be
made concerning manual skill, both tests should normally be used,
although local norms and validities will sometimes make possible the
omission of one test. In any case, it will generally be wise also to use a
test of gross manual dexterity, such as the Minnesota, in testing for
counseling; in selection testing both should be used in the research
stages, dropping the test or tests which prove not to have predictive value
in the local situation.

The Purdue Pegboard (Science Research Associates, 1943) 

The Purdue Pegboard was developed by the Purdue Research Foundation,
Purdue University, and published in 1943 as a test of two types of
manual dexterity: arm-and-hand dexterity of a finer type than the
Minnesota Test, and finger dexterity manifested in a more realistic way
than in the O’Connor tests. Although still new and relatively little
studied, both motion study of the test and preliminary data suggest that it
merits detailed consideration. As pointed out early in this chapter, it
appears to tap ability to perform global movements and to eliminate
non-essential operations to a degree greater than other manual dexterity
tests.

Applicability. The Pegboard was designed as a group test for and
standardized upon adult industrial workers. It has since been
standardized upon veterans counseled in guidance centers and upon college
students, but, like other manual dexterity tests, its development through
adolescence to adulthood has not been studied. As dexterities generally
mature early, it is probably safe to use the adult norms with older high
school boys and girls.

Content. The Purdue Pegboard consists of a 12 x 18 inch rectangular
board with four shallow cups or trays at one end, and two rows of 1 $ inch
holes perpendicularly down the middle. Fifty easily fitting metal pins
are provided, together with 20 metal collars and 40 metal washers made
to fit the pins.

Administration and Scoring. The test is administered with the subject
seated at a 30-inch table on which the board is placed with the cups away
from the subject. If the psychometrist sits opposite the subject, he must
be careful not to let his own hands get near enough to the cups to seem to
interfere with the testing. The first part tests the right hand, putting the
pins in the holes one at a time; the second repeats with the left hand; the
third tests both hands simultaneously; the fourth score consists of the
first three combined; and a final sequence consists of assembling pin,
washer, collar, and washer using right, left, right, and left hands. Thus
dexterity is tested for each arm and hand, with fingers playing a simple
grasping role; ability to perform the same operation with both hands
simultaneously is measured; and ability to perform different operations in
a co-ordinated way with the two hands simultaneously is assessed. As Cohen
and Strauss (16i>) point out, if the worker can effectively merge the two
sets of operations in a task such as the assembly test he saves time in the
total task; if he must work first with one hand and then with the other,
he adds to the time required. The assembly test also seems to require finer
finger movements than the other parts, which appear to resemble the
O’Connor tests. The score is the number of pins placed in 30 seconds
(sequences 1 to 3) and the number of assemblies made in 60 seconds.
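
The scoring just described can be sketched as a small record. This is
only an illustration of the arithmetic as stated in the text (three
30-second pin-placing scores, their sum as the fourth score, and a
separate 60-second assembly score); the class name, field names, and the
sample figures are hypothetical and are not taken from the manual.

```python
from dataclasses import dataclass


@dataclass
class PegboardRecord:
    """One administration of the test; all names here are illustrative."""
    right_hand: int   # pins placed with the right hand in 30 seconds
    left_hand: int    # pins placed with the left hand in 30 seconds
    both_hands: int   # score recorded for the two-handed part (30 seconds)
    assembly: int     # score recorded for the assembly part (60 seconds)

    @property
    def combined_pin_score(self) -> int:
        # The "fourth score": the first three parts added together.
        return self.right_hand + self.left_hand + self.both_hands


# Hypothetical subject, not drawn from any norm table.
record = PegboardRecord(right_hand=17, left_hand=15, both_hands=13, assembly=36)
print("combined pin-placing score:", record.combined_pin_score)
print("assembly score:", record.assembly)
```

Keeping the assembly score apart from the combined pin-placing score
mirrors the text, which treats the two as measures of different kinds of
dexterity.
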

Norms. The revised one-trial norms (1948 Manual) are for .] 1 ;;S women
applicants for factory employment, 392 college women, 2 p;<) college men
and veterans, and 865 male industrial applicants, treated separately, but
the numbers are not given in the manual as finally printed. Three-trial
norms are based on data from 500 college students which made possible
the extrapolation of norms for all groups. Analysis of data for 900
subjects by previous employment, regional origin, and race failed to reveal
any group differences. But norms for veterans published by Long and
Hill (- 179 ) tend to be somewhat lower, particularly the total scores.
Although these norms are helpful for general interpretation, they throw
no light on the vocational significance of the test scores. Occupational,
especially semiskilled, norms are badly needed.

Standardization. The test authors stated in the original manual that
considerable data had been gathered concerning the test’s validity, but
that government (wartime) regulations made impossible their
publication. They added that comparable studies were being made elsewhere,
results of which were made available in the revised manual and are
described below, under validity.

Nothing is said, in the manual, concerning the process of developing
the test. Reliability data are given: .71 for the total score of the combined
pin-placing tests, and .68 for the assembly test, one trial each (N = 17^ to
lTl). Three-trial reliabilities are estimated to be .88 and .86. To this
writer these data suggest that the board should be modified to provide
three rows of holes at each side of the board, more pins, washers, and
collars, and 90 seconds of working time for each of the pin-placing tests
rather than 30 seconds. This would not unduly lengthen the test and
would give it a reliability more in line with modern standards.
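
The three-trial estimates quoted above agree with what the Spearman-Brown
formula gives when a test is lengthened threefold. The sketch below
assumes that this is how such estimates would be obtained; the text does
not say which method the manual used. The same projection would apply to
tripling the working time from 30 to 90 seconds, provided the added
material behaves like the original.

```python
def spearman_brown(r_single: float, k: float) -> float:
    """Projected reliability of a test lengthened k times (Spearman-Brown)."""
    return k * r_single / (1.0 + (k - 1.0) * r_single)


# One-trial reliabilities reported in the text.
one_trial = {"combined pin-placing": 0.71, "assembly": 0.68}

for name, r in one_trial.items():
    projected = spearman_brown(r, k=3)
    print(f"{name}: one trial {r:.2f}, three trials {projected:.2f}")
```
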

Reliability. As indicated above, the reliability of the standard
one-trial test leaves something to be desired. Surgent (807) has confirmed
the test authors’ data with a group of 233 women factory workers.

Validity. The test being quite new, only one field validation study has
as yet been published (807). There will undoubtedly be a number before
this book has been long off press.

Table if) gives the results of the validity studies reported in the manual,
by permission of Science Research Associates. It should be noted that
numbers were very small (and the r’s therefore not very reliable) in all
groups except the last. For this group of 233 radio tube mounters, with
ratings as a criterion, the validity of the three-trial assembly test was .tip
The trend of the other correlations is encouraging but more adequate
data are clearly needed.
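
The remark that correlations from very small groups are not very reliable
can be made concrete with an approximate confidence interval based on
Fisher's z transformation. The sketch below uses a purely hypothetical
coefficient of .30 and a hypothetical small group of 25; only the group
size of 233 comes from the text.

```python
import math


def correlation_interval(r: float, n: int, z_crit: float = 1.96) -> tuple[float, float]:
    """Approximate 95 percent confidence interval for r (Fisher z method)."""
    z = math.atanh(r)                 # transform r to z
    se = 1.0 / math.sqrt(n - 3)       # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


HYPOTHETICAL_R = 0.30  # illustrative value only, not a reported validity

for n in (25, 233):
    low, high = correlation_interval(HYPOTHETICAL_R, n)
    print(f"N = {n:3d}: interval {low:+.2f} to {high:+.2f}")
```

With the small group the interval extends below zero, so a coefficient of
this size could easily be a chance result; with 233 cases it does not.
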

It should be noted that the suggestions for interpretation on the Score
Sheet provided by the publisher include artists, chauffeurs, mechanics,
musicians, pilots, and others, as well as assembly workers, as groups for
which the test should prove useful. But no data support these claims,
and while pilots, at least, might conceivably make high assembly
(co-ordination) scores, there definitely is no relationship between manual
dexterity and success in flying (ui.j).

Use of the Purdue Pegboard in Counseling and Selection. Until further
validation data are provided, there is only one kind of situation in
which this test can now be used for counseling: that in which the
counselor, or a psychometrist who writes detailed test reports, has a
first-hand knowledge of factory jobs acquired by job-analysis experience.
Such a user of the test may obtain from it clinical insights into the manual
dexterities of his clients, which he then subjectively translates into
occupational terms. Unless this translation is based on intensive
job-analysis information it is likely to be dangerously misleading. The
observer will want to look for efficient use of hands, particularly for
global coordinated movements in the assembly test. The nature of the test is
such that this writer is confident that specific occupational norms and
good validity data can be made available in due course.

In selection, the test may similarly be used in situations in which
decisions have to be made before validation and local norming can be
completed. Again, job analysis data are needed. On the other hand, it is
possible and wise, more frequently than most so-called practical men
admit, to make immediate decisions on other bases, and to use tests at
first only to gather research data which will provide a better basis for
similar decisions as the need recurs in the future. If research data are not
gathered the first time, but the tests are put to intuitive use, then
judgmental errors, comparable to those which the tests were adopted to do
away with, are perpetuated. One type of intuition replaces another.

The Purdue Pegboard, modified as suggested in the discussion of
reliability, seems to be an extremely promising test for assembly, packing,
machine-operation, and other fairly precise manual jobs. The analysis of
manual work by Cohen and Strauss, discussed in the opening section of
this chapter, and the nature and validity of other manual and finger
dexterity tests, suggest this. It should be valid for a greater and manually
more demanding variety of jobs than the Minnesota Rate of Manipulation
Test, and should have higher validities than the O’Connor dexterity tests
for jobs such as those for which these have proved valid. But evidence
should be assembled and published.

MECHANICAL APTITUDE

Nature and Role 

THE TITLE of this chapter, and indeed the writing of a separate
chapter on this subject, are a concession to practical considerations and
to popular usage, rather than an organization of materials dictated by
the nature of aptitudes. Counselors, personnel men, and vocational
psychologists have long been accustomed to thinking in terms of
mechanical aptitude. They have not defined the term in any strict sense, but
have used it operationally to refer to the characteristic or set of
characteristics which tends to make for success in mechanical work. Tests
have been developed which have proved to be reasonably valid for various
types of mechanical occupations. In one sense, then, there has been some
justification for using the term mechanical aptitude. But while these
practical developments were taking place psychologists were also studying
mechanical aptitude in order to ascertain whether it was in fact one trait
or aptitude in the limited sense of the term, or whether it was really a
combination of aptitudes.

The first significant attempts to study, rather than simply measure,
mechanical aptitude were carried out by Cox (175) in England and by
Paterson and associates (588) at the University of Minnesota. Using
especially constructed mechanical apparatus which did not lend itself
well to scoring, Cox applied factor analysis to his data according to
Spearman’s two-factor method. He isolated a factor which seemed to be of
special importance in the mechanical tasks, and therefore might be called
“mechanical aptitude”; but it was an eductive factor of the spatial
relations type, rather than something peculiarly mechanical which might
be called “mechanical comprehension.”

At about the same time Paterson and his colleagues were carrying out
the Minnesota Mechanical Abilities Project, in which they first tried out
a number of existing tests, then revised and selected from these to make
a definitive study of mechanical aptitude in junior high school boys.

As Harvey (350) points out, the Minnesota project was superior in test
ideas and construction to the Cox, but was somewhat weaker in theory,
for Cox utilized factor analysis theories and procedures which were not
yet in use by American psychologists. He consequently had not only
superior statistical methods but also somewhat more clear-cut hypotheses
to guide him in planning his project. In the Minnesota project the
Minnesota Mechanical Assembly, Spatial Relations, and Paper Form
Board Tests were administered, together with the Otis, an interest
inventory, and the Stenquist Mechanical Aptitude or Picture Tests
(resembling the O’Rourke). Data on cultural status, recreational interests,
mechanical operations or activities around the home, father’s mechanical
operations, tools owned by the subject and by his father, mechanical
ability required in the father’s occupation, and similar factors were
obtained. The subjects were 150 junior high school boys in Minneapolis.
Validity aspects of the study will be considered in connection with
specific tests; at this point our interest is in the nature of the factors
measured by the tests which were selected to appraise mechanical
aptitude.

Information on this subject comes from studies by Harrell (336) and
by Wittenborn (935). Harrell applied Thurstone’s centroid method of
factor analysis to the Minnesota battery, which he had administered to
91 cotton-mill machine fixers together with more than 30 other tests.
Five factors emerged, of which two, perception of detail and visualization
of space relations, were important in the Minnesota tests. The former
was demonstrated by repetitions of the tests to be a routine type of
ability, whereas the latter played a part only in the earlier
administrations of the test to a given subject; Harrell therefore described
the spatial factor as the equivalent of mechanical ingenuity. Wittenborn
applied the same factorial method to the data of the original study. In this
case the intercorrelations between the Minnesota Mechanical Assembly Test
(described later and cited here as the prototype of “mechanical aptitude”
tests) and the Minnesota Spatial Relations Test and Paper Form Board
were respectively and ..jp. This suggests that spatial visualization
plays an important part in “mechanical aptitude,” but does not explain
entirely performance on such a test. Wittenborn isolated four factors,
of which only one, spatial visualization, played an important part in the
Mechanical Assembly Test. Spatial visualization accounted for 37 percent
of the variance in the Assembly Test; this is to be compared with r f r f
percent of the variance in the Spatial Relations Test, .\9 in the Paper
Form Board, and r,G of ratings of the quality of shop work, showing in
another way that spatial visualization is important but still only one of
the factors which play a part in such instruments as the Minnesota
Mechanical Assembly Test.

Neither Harrell’s nor Wittenborn’s studies raise the question, or
throw any light on the nature, of the factor or factors which account
for the remaining 63 percent of the variance of the Minnesota Assembly
Test. Neither does another analysis of Cox’s mechanical assembly tests
by Slater (720), although the last-named investigator agreed with the
others in finding no special mechanical factor over and above general
intelligence and spatial visualization. But this inability to isolate any
other factors is in part a function of the types and varieties of tests which
are used in the factor analysis: one can locate only the factors which are
important in several of the tests, and if a factor is important in only one
or two tests it may not emerge as significant.
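
The figures of 37 and 63 percent follow from the usual convention that the
proportion of a test's variance attributed to an uncorrelated factor is
the square of the test's loading on that factor. The sketch below assumes
that convention; the implied loading is derived here for illustration and
is not a value quoted from the studies.

```python
import math


def loading_from_variance(proportion: float) -> float:
    """Factor loading implied by a proportion of variance (orthogonal factors)."""
    return math.sqrt(proportion)


spatial_share = 0.37                      # reported for the Assembly Test
unexplained_share = 1.0 - spatial_share   # left to other factors and error

print(f"implied spatial loading: {loading_from_variance(spatial_share):.2f}")
print(f"variance not accounted for: {unexplained_share:.0%}")
```

The unexplained share includes error variance as well as any factors not
represented in the battery, which is one reason it cannot be read as the
size of a single missing factor.
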

Bingham (pp Ch. 11) suggests that factors in mechanical success
are mechanical aptitude, measured by tests such as the Minnesota
Assembly and Spatial Relations Tests, manual dexterity (demonstrated
to be unimportant), perceptual acuity (confirmed), and mechanical
information. The Minnesota study included a measure of mechanical
information (“the shop operations information criterion”) which had a
correlation of ep, with the Assembly Test, but this item was omitted in
Wittenborn’s analysis, and nothing comparable to it was included in
Harrell’s data. Both authors included the Stenquist Picture Tests which
are generally thought to measure mechanical information and which
correlate . jo and . jf> with the Minnesota Mechanical Assembly Test.
Although only 22 and 18 percent of the variance in the Stenquist tests is
accounted for by the spatial factor (935), and perceptual speed and
accuracy plays some part in them (336), they are virtually unanalyzed by
Wittenborn and Harrell’s studies.

Guilford’s analysis of a greater variety of tests tried out in the Army
Air Forces’ Aviation Psychology Program (p> 1 17) provides the answer
to the question of what other factors play a part in tests of mechanical
aptitude. In this analysis, thanks to the inclusion of a test of mechanical
information, an aptitude test patterned after the Bennett Mechanical
Comprehension Test (described below) was found to be heavily saturated
with two factors: spatial visualization and mechanical information.

What has commonly been thought of as mechanical aptitude, what
vocational psychologists have for twenty years known to be partly spatial
visualization, and what some authorities (ej.j) erroneously thought to be
also partly manual dexterity, finally emerges in Harrell’s and Guilford’s
studies as a composite of spatial visualization, perceptual speed and
acuity, and mechanical information. As in the case of Binet’s global
approach to the problem of measuring intelligence, this lumping together
of several aptitudes in one test has had its advantages, for in days
when factor analysis was in its infancy reliable and valid tests were
developed, effective even though impure, for the prediction of success
in mechanical activities. With the information and techniques now
available purer tests can be developed which will result in a better
understanding of both aptitudes and activities, and which will be more
versatile in their applicability. There is room for doubt as to whether
they will be more valid for all purposes, even when combined in
batteries, because of the advantages of face validity and the inclusion of
specific factors which characterize factorially impure tests depending
heavily on job analysis and job content for their items. In the meantime
multi-factorial tests of so-called mechanical aptitude or comprehension
are among the most valid tests available. For this reason they are dealt
with as such in this chapter, and purer tests of spatial visualization are
treated separately in the next, just as tests of manual dexterity were
taken up in the preceding chapter.

Specific Tests

One of the earliest tests of mechanical aptitude was the Stenquist
Mechanical Assembly Test (7 r } r t ), consisting of a long narrow box, each
compartment of which contained a mechanical contrivance to be
assembled by the examinee. The ten items consisted of a mouse trap, a
push button, and similar everyday objects. Stenquist also developed two
picture tests designed to measure the same type of aptitude, but, since
manipulation and trial of the parts is impossible in a printed test, it has
generally been thought of as being more heavily saturated with
information than the apparatus tests. As a result of work with Army trade
and mechanical aptitude tests during World War I, O’Rourke (277:2^)
developed a graphic and verbal test of the same type. Paterson and
associates (588) modified and lengthened Stenquist’s Assembly Test as the
Minnesota Mechanical Assembly Test for their intensive study of the
nature and measurement of mechanical aptitude. More recently, Bennett
(T8) developed his Test of Mechanical Comprehension in order to tap
a higher level of mechanical aptitude than the Stenquist, O’Rourke, and
other paper-and-pencil tests already available. A totally different type of
composite test was constructed by MacQuarrie (yo.j), who combined
subtests of spatial visualization and manual dexterity in a test of
so-called mechanical aptitude.

Of these and other tests like them, the Minnesota Mechanical Assembly
Test, the O’Rourke Mechanical Aptitude Test, the Bennett Mechanical
Comprehension Test, and the MacQuarrie Test of Mechanical Ability
have been selected for detailed treatment. The assembly test has been
chosen as the most adequate of its type and because of the insights which
studies using it give into the nature and organization of aptitude for
mechanical work, even though it is no longer widely used. The O’Rourke
has been as thoroughly studied as the Stenquist and other picture tests,
and has the advantage of more recent and more extensive norms than
most; it is still widely used, although there is room for a well-constructed
and up-to-date test of the same type. The Bennett is one of the newest
but most thoroughly studied and widely used graphic tests of mechanical
aptitude, and taps a higher level of aptitude than the other mechanical
aptitude tests. And the MacQuarrie is not only unique as to content, but
widely used and studied; although it could just as well be dealt with
under tests of manual dexterity or spatial visualization, it is included in
this chapter as a composite test of mechanical aptitude. The Purdue
Mechanical Adaptability Test is also treated, more briefly, as a new
instrument of some promise.

The Minnesota Mechanical Assembly Test (Marietta Apparatus Co., 1930)

This test was developed as a part of the University of Minnesota’s
study of mechanical aptitude, in the preliminary work of which it was
found that Stenquist’s ten-item test had a reliability of only .yi>. Three
boxes, or a total of 36 mechanical items, were used with a resulting
reliability of .90. Three of these items have since been omitted, making
a total of 33. The Stenquist test having been one of the first fairly good
tests of mechanical aptitude, and the Minnesota being a demonstrated
improvement upon that, the latter came rapidly into widespread use in
clinics and guidance bureaus doing individual testing with adolescent
boys; it has not been so extensively used in other situations, because of
administration time, wear and tear, and the effects of experience.

Applicability. Like the Stenquist, the Minnesota Mechanical Assembly
Test was designed for use with junior high school boys, and particularly
for the prediction of success in shop courses. It was recognized that
experience or familiarity with mechanical objects might well play an
important part in scores on such a test, even at this age; the Minnesota
study therefore analyzed the relationship between a number of
environmental factors which reflect or constitute differences in experience,
either direct or vicarious, with mechanical objects and processes. Two
experience items showed positive correlations with the assembly test:
recreational interests (.23) and mechanical household tasks such as
electrical repairs performed by the boy (.40); on the other hand, ratings of
the mechanical ability required by the father’s occupation, the tools owned
by the boy, and the tools owned by the father, had no relationship with
the assembly test scores of the 150 boys of the study (r’s = -.11, .i.j, and
.03). Two other relationships are of interest here, one being that with
age which is understandably negligible (.13) in a group as relatively
homogeneous as 7th and 8th grade boys, and the other that with scores
on a test of shop information which is moderately high (.3-,). It is
noteworthy that the three experience items with which substantial
correlations were found probably involve both cause and effect: boys with
more mechanical aptitude could be expected to choose mechanical hobbies,
seek to do household repairs, and learn a good deal about shop processes;
at the same time, boys who have such hobbies, perform such chores, and
learn well in shop courses could be expected to acquire the knowledge to
do better than others on a test of mechanical assembly. On the other
hand, the items which are more strictly environmental, i.e., not within
the control of the boys but affecting them nonetheless, show negligible
relationships with assembly test scores; boys do not choose their fathers’
occupations nor decide how many tools their fathers will have, and
economic factors and parental ideas probably determine the boys’ own
tools more than do their desires, but one would expect mechanically
inclined fathers who have and use their own tools to have some effect on
the mechanical information possessed by boys in their early teens. More
important, perhaps, than mere possession of mechanical tools and
hobbies by the father may be the extent of identification of the son with
the father and of father acceptance of the son. If this is so, the continua
are not experience vs. no-experience, but mechanical-father-identification,
and non-mechanical-father-rejection, each of which must be combined
with son-acceptance and son-rejection in order to describe the
emotional as well as material environment which shapes the boy’s
interests and information. Unfortunately, no such refined studies have as
yet been attempted. That the mechanical activities of the fathers do not
affect the sons seems to indicate that at this age the Minnesota Mechanical
Assembly Test is more a measure of differences in mechanical insight
(spatial visualization) than of mechanical information.

Perhaps this is why Wittenborn’s factor analysis (935), cited in the
opening section of this chapter, and using the original Minnesota data,
failed to isolate any other important factors in this test. Harrell ( c ;-U)
reported a correlation of -.22 between inexperience and assembly test
scores in his study of mechanical aptitudes in adult cotton-mill machine
fixers. Adults who had had mechanical experience did better than those
who lacked it (Harrell also showed that practice on the assembly test
reduced it to a measure of perceptual speed and accuracy). We have
already seen that Guilford (31 (>/} 17) found an experience factor in
another mechanical comprehension test used with aviation cadets. These
data lead to the conclusion that in early adolescence tests such as the
Minnesota Mechanical Assembly Test are primarily measures of mechanical
comprehension (spatial visualization), whereas in late adolescence and
adulthood they also tap mechanical information (experience).

Clinical experience with the assembly test has led to the generally
accepted conclusion that it is unsuitable and too easy for use with older
adolescents and adult men, and too difficult for most women. The first
is perhaps verified by the AAF study (^lb) cited above, but none of them
have actually been objectively confirmed with the assembly test itself
except through data on age differences in the reliability of the test (see
below). The most objective evidence, apart from Harrell’s data, lies in
the norms for various age and occupational groups, which show
increasingly higher scores from age 12 to age 19 (raw scores of 2^2 to
299, the former median being at approximately the rotlr percentile for
19-year-olds). But the available data do not tell us whether these
increases with age are the result of maturation of spatial visualization
or of increased familiarity with mechanical objects. As manual training
teachers and ornamental ironworkers in the Minnesota Employment
Stabilization Research Institute fell midway between the average 18- and
19-year-old boy in the original norms, auto mechanics were slightly
lower, and the average employed adult was little more than midway
between the average 17- and 18-year-old, the implication is that either
the sampling in the adolescent group was skewed toward the upper limits
or maturation of spatial visualization plays a greater part in assembly
test scores than experience in mechanical activities. If this were not
so, miscellaneous boys would not surpass skilled mechanical workers. It
seems more probable that the adolescent sample is not adequate at the
upper limits (due to elimination in high school) and that skilled
workers surpass 19-year-olds about as much as they do 17-year-olds, that
is, by more than one sigma. Lacking adequate objective evidence
concerning the effects of experience on the assembly test scores of
adults, it seems wise for practical purposes to agree with Bingham
(91:308) and with Paterson, Schneidler and Williamson (590:222) that the
varied amounts of mechanical experience which characterize adults make
it unwise to use this test with that age group; the theoretical question
remains open until better evidence is accumulated.

Content. The Minnesota Mechanical Assembly Test consists of three
boxes containing 33 mechanical objects such as an expansion nut, a
hose-pinch clamp, a wooden clothespin, a push-button door bell, a spark
plug, an inside caliper, and a petcock.

Administration and Scoring. A fixed amount of time is allowed for
work on each object, these being presented unassembled in their
compartments. Scoring is on the basis of proportion of possible
connections made in the allotted time. The psychometrist needs to be
thoroughly familiar with the assembly and disassembly of the objects,
both from studying the directions and from actually practicing with the
materials, especially the latter. He must know not only how to put the
parts together, but what condition they should be in when new, for many
boxes actually in use contain bent or broken parts and non-standard
replacements which change the nature of the task. In fact, one problem
brought out by World War II testing operations, and not adequately
realized when investigations such as the University of Minnesota’s study
of 150 boys were planned, is the drastic effect on apparatus tests of the
wear and tear of large-scale testing. In the Air Force program, for
example, it was found necessary to assign an officer and several enlisted
men to an apparatus control unit at each testing center, their sole
function being to make statistical studies of the effects of differences
in supposedly identical pieces of apparatus on test scores and to
establish correction formulas for raw scores on each apparatus. Most of
these differences were due to wear and tear through use, as many as 100
men per day being tested by a given piece of equipment.
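
The form of the Air Force correction formulas is not described here. As a
purely illustrative sketch, and assuming the simplest linear form, raw
scores obtained on one piece of apparatus could be rescaled so that their
mean and sigma match those obtained on a reference apparatus by a
comparable group. The function names and the sample scores below are
hypothetical and are not taken from the Air Force program.

```python
from statistics import mean, pstdev


def fit_linear_correction(apparatus_scores, reference_scores):
    """Return (slope, intercept) mapping apparatus scores onto the reference scale.

    Assumption: the two groups of examinees are comparable, so any
    difference in mean or spread is attributed to the apparatus itself.
    """
    slope = pstdev(reference_scores) / pstdev(apparatus_scores)
    intercept = mean(reference_scores) - slope * mean(apparatus_scores)
    return slope, intercept


def correct(raw_score, slope, intercept):
    """Apply the linear correction to one raw score."""
    return slope * raw_score + intercept


# Hypothetical raw scores from a worn apparatus and from a reference apparatus.
worn = [10, 12, 14]
reference = [20, 25, 30]

slope, intercept = fit_linear_correction(worn, reference)
corrected = [correct(score, slope, intercept) for score in worn]
print(slope, intercept, corrected)
```

A correction of this kind equates only the first two moments of the score
distributions; it would not repair a change in what a damaged apparatus
actually measures.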

Norms. Norms for boys aged 11 to 21 were published by Paterson and
associates (588) as a result of the Minnesota Mechanical Abilities Project,
and for general adults and specific occupations by Green and others
(30G) after the test was used in the Minnesota Employment Stabilization
Research Institute. Paterson does not make clear the number of cases
used in the original norms, which included at least 150 boys in 7th and
8th grades, but unknown numbers at the higher levels. Since the test is
most useful at the junior high school level this is not a serious limitation.
The adult norms are based on the Minnesota standard sample of 500
employed adults; the specific occupational groups are small, ranging
from 18 draftsmen to ibp manual-training teachers. In view of the
presumed effects of experience and the suitability of other tests for adult
use, the adult norms are of questionable value; they do show the
expected group differences, as will be seen below, but these are not as
great as one would expect in a good aptitude test, perhaps because of the
leveling effects of experience and information with items such as these.

Standardization and Initial Validation. As has already been indicated,
the Minnesota Mechanical Assembly Test was developed as a more
reliable edition of Stenquist’s test. As a part of the intensive study of
mechanical abilities carried out by Paterson and associates (588) it was
correlated with a variety of other tests and with a number of experience
variables in order to throw light on its nature and validity. Some of these
have already been discussed, in connection with the question of the
applicability of the test; others remain to be considered.

In the relatively restricted age, but somewhat greater intellectual,
range of the 7th and 8th grades the correlation between assembly test
scores and Otis I.Q. was .ob. Spatial visualization as measured by the
Minnesota Spatial Relations and Paper Form Board Tests, on the other
hand, had correlations of .56 and . p) respectively, showing the important
role of the spatial factor in mechanical assembly work at this age.
Correlations with the Stenquist Picture Tests were .pi and .pi, as might be
anticipated with paper-and-pencil tests of mechanical comprehension.

There was no relationship between assembly test score and average
academic grades p .1;;), but the correlation with ratings of the quality
of shop operations was .55, and that with a test of shop information was
.gV The higher correlation with operations, as opposed to information,
suggests that the test was accomplishing its objective of measuring
aptitudes for mechanical work. Certainly it predicted success in that much
better than in academic work.

Two other relationships are of interest, one a correlation of .oj with
preference for mechanical occupations, the other a correlation of
with scores on a mechanical interest inventory. The discrepancy suggests
that the expressed occupational preferences of junior high school boys
may not be valid indicators of interest, whereas inventory scores may be,
a conclusion confirmed by other studies reviewed in the chapter on
interests. In view of the confirmation of this deduction, it may be concluded
that mechanical interests and mechanical aptitude tend to be associated,
although the relationship is far from perfect. Perhaps the relationship is
due to the role of interest in the acquisition of information, and the role
of information in so-called mechanical aptitude.

Reliability. In the original study of the Minnesota Mechanical
Assembly Test its reliability was found to be ajj when computed by the
odd-even method and corrected by the Spearman-Brown formula, based
upon 217 junior high school boys (588). In the MESRI project the
corrected odd-even reliability was only .79 for .j p] adult men, and .(>8
for 127 adult women (187), the difference presumably being due to the
effects of experience in adolescence and adulthood. Brush (122) found
a corrected reliability of .65 with engineering freshmen. For this reason
only extremely high and extremely low scores are likely to have any
significance for adults. In another study using deaf children as subjects,
Stanton (7 pj) found retest reliabilities of .71 for boys (N = r,7) and .(’>0
for girls (N = y(i) after a period of two years; in view of the probability
of experience with mechanical objects at that age, and of the known
effects of maturation on spatial visualization, these may be taken as
not out of line with the other report based on children.
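
The odd-even coefficients above are correlations between two half-tests
stepped up to full length by the Spearman-Brown formula. The sketch below
shows that correction and its inverse; the function names are chosen here
for illustration, and the half-test figure it prints is derived
arithmetically from the corrected .79 reported for adult men, not taken
from the MESRI report itself.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Estimate the reliability of a test lengthened by `length_factor`.

    With length_factor = 2 this is the usual odd-even correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def half_test_correlation(r_full):
    """Invert the two-half correction: recover the odd-even correlation."""
    return r_full / (2.0 - r_full)


# Corrected reliability reported for adult men in the MESRI project.
r_full_men = 0.79
r_half_men = half_test_correlation(r_full_men)
print(round(r_half_men, 3))                  # uncorrected odd-even correlation
print(round(spearman_brown(r_half_men), 2))  # stepping it up again gives .79
```

The correction assumes that the two halves are equivalent; with a small
number of heterogeneous objects, as in an assembly test, that assumption
is only approximately met.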

Validity. The Minnesota Mechanical Assembly Test was correlated
with intelligence tests in the MESRI project (3off), where with adult
subjects and the Pressey Classification and Verification Tests the
coefficients ranged from .10 to .2(», and by Super in an unpublished study with
the Otis and NYA youth, in which the correlation was .2 \. While these
coefficients are slightly higher than those reported in the original work
with the test, they are low enough to be negligible.

No published data on the correlation between widely used manual
dexterity tests and assembly test scores have been located, but several less
used tests in the Minnesota battery yielded low or negligible correlations.
In his factor analysis of these data, Wittenborn (935) found that manual
dexterity did not have an appreciable loading in the assembly test, and
Harrell (336), using the same tests and new subjects, confirmed the
absence of a manual dexterity factor in this test. In an unpublished study
of 50 junior high school boys the writer found a correlation of only .05
between assembly test and Minnesota Placing Test scores. Apparently
manual dexterity is subordinate to other factors in the task of assembling
mechanical objects such as those in the Minnesota test.

Although the assembly test was correlated with the Stenquist Picture
Tests in the original study, it has apparently not been related to other
tests of mechanical comprehension, except in an unpublished study by
the writer, in which it and the O’Rourke Mechanical Aptitude Test were
administered to fifty junior high school boys with a resulting correlation
of .Or,. This is higher than that of ..jO reported with the Stenquist in the
original study and confirmed by Harrell (331) with adults, although the
writer used a similar group of subjects; it is not, however, contrary to
what one might expect in apparatus and paper-and-pencil tests designed
to measure the same type of aptitude. Perhaps it indicates that the
O’Rourke more closely approximates a graphic version of the Minnesota
than does the Stenquist.

The important role of spatial visualization in mechanical assembly
tests seems to have also been accepted virtually unchecked as a result of
the Minnesota project, in which the correlation was .56 between assembly
test and Minnesota Spatial Relations Test. In the unpublished study of
junior high school boys referred to above the writer found the expected
correlation of ..jS between the assembly test and the Revised Minnesota
Paper Form Board, but one of only .25 with the Minnesota Spatial
Relations Test. In view of the other data, this may be only a chance lack of
relationship, which might prove to be higher in other similar samples of
the same population. Harrell (33]) reported a somewhat higher
correlation of .35 between Minnesota assembly and spatial relations tests
administered to adult factory workers. However, the results of his factor
analysis agreed with Wittenborn’s (935) in describing spatial
visualization as the principal factor in the assembly test, and Tredick (869)
found its highest correlation among Thurstone’s PMA Tests to be with the
spatial factor (.3 j; Reasoning was .30, Induction .2<h Perception .23).

The correlation between mechanical assembly test and mechanical
interest inventory scores, reported as .yz for junior high school boys by
the original study, was found to be only .10 when the same tests
(Minnesota Assembly and Minnesota Interest Analysis) were used with adults
by Harrell (33 [). Whether this is a result of the effects of experience on
the test scores, giving them different meaning for adults, or a direct
contradiction of the Minnesota findings is not shown by the data; it
seems likely that it is to be explained by age differences in experience
and its effects on assembly scores.



Grades have been used as a criterion by Stanton (719), Tredick (869)
and Brush (122). Stanton administered Minnesota Battery A (Assembly,
Spatial Relations, and Paper Form Board) to 121 deaf boys aged 12 to
11. The battery validity was .50, and that for the assembly test was .12,
using amount of time spent in shop work as a criterion. This finding is
not as favorable as the original .55 reported by the test authors, and the
shrinkage seems greater than that normally found between first and
subsequent validities, but this may be due to the substitution of a time
for a quality criterion.

Tredick’s study involved 113 freshmen students of home economics at
Pennsylvania State College. She used the Minnesota Mechanical
Assembly Test together with an extensive battery of other tests. Her criteria
were semester-point average and grades in first semester courses in art,
chemistry, and English composition. The correlations were respectively
• 11, -17. -ifi. and -.01, none of which are high enough to be of value.

Brush administered the assembly test to io.j freshmen engineering
students at the University of Maine, correlating results with grades for
the first year and for all four years. The two coefficients were .28 and .1*7,
both of them reliable. Apparently the test has sufficient value for the
prediction of success in engineering training to justify its inclusion in a
battery, despite the effects of experience by the end of high school. In
view, however, of the cumbersomeness of administration and scoring,
and of the high correlations with paper-and-pencil tests of the same
type, it is doubtful whether the increased predictive value of a
well-selected battery would warrant the time and trouble to include it.

Success on the job has not been used as a criterion with the Minnesota
assembly test, judging by the lack of such reports in the journals. In
view of its greater suitability for use with junior high school students
than with adults this is perhaps not surprising; it is to be regretted,
however, that no follow-ups have been made, to ascertain the
relationship between assembly test scores in junior high school and choice of
and success in subsequent mechanical employment.

Differentiation of occupational groups by the Minnesota Mechanical
Assembly Test was demonstrated at the Employment Stabilization
Research Institute (223), where machinists scored at the 80th percentile,
manual-training teachers, ornamental ironworkers, and garage mechanics
at the 68th, and draftsmen at the (Eqh percentile. Workers in less
mechanical occupations such as office clerks, machine operators, retail
salesmen, and policemen, generally make scores less than one sigma
above the mean of the general population. These trends are in the
expected directions, although, as pointed out earlier, the mean scores
of manual-training teachers and certain other mechanically inclined
groups are not as much above the mean as one would anticipate, perhaps
because universal experience with the items in the test tends to minimize
differences in mechanical comprehension among adults.
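
Percentile ranks and sigma units are mixed in the preceding paragraph,
and the two can be put on a common footing only by assuming a
distribution. The sketch below assumes that scores in the general
population are normally distributed, an assumption made here for
illustration and not stated in the MESRI report; the function names are
likewise illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, sigma 1


def percentile_to_sigma(percentile):
    """Convert a percentile rank (0-100, exclusive) to sigma units above the mean."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)


def sigma_to_percentile(sigma_units):
    """Convert a distance from the mean, in sigma units, to a percentile rank."""
    return 100.0 * STANDARD_NORMAL.cdf(sigma_units)


# The 68th percentile reported for manual-training teachers and similar groups.
print(round(percentile_to_sigma(68), 2))

# The percentile rank that corresponds to one sigma above the mean.
print(round(sigma_to_percentile(1.0), 1))
```

Under this assumption a group mean at the 68th percentile lies roughly
half a sigma above the general mean, while one sigma above the mean
corresponds to about the 84th percentile.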

Occupational satisfaction would seem to be a logical criterion against
which to validate mechanical aptitude, on the hypothesis that those who
are relatively lacking in it would find their work uncongenial and
perhaps a strain, while those who are relatively high in mechanical
comprehension would solve new problems and master new techniques readily
and with zest. As an aptitude which is also somewhat related to interest
this should perhaps be more true of mechanical comprehension than of
most purer “aptitudes.” Despite these facts, no known studies have
correlated scores on the Minnesota Mechanical Assembly Test with job
satisfaction.

Use of the Minnesota Mechanical Assembly Test in Counseling and
Selection. The evidence which has been reviewed in the preceding
paragraphs is more adequate concerning the standardization and
validation of the Minnesota Assembly Test than are comparable data for most
tests, the authors having systematically studied it in a variety of respects.
Unfortunately it has not been so thoroughly studied since that time,
despite Wittenborn’s and Harrell’s factor analyses. One reason for this
is the cumbersomeness of the test, not only in administration and
scoring, but also in maintenance; another is the proved adequacy of
paper-and-pencil tests designed to measure the same factors.

Despite these defects, the assembly test is useful with early adolescents
whose significant experiences with mechanical items such as those in the
test are still largely dependent upon aptitude and interest. The effects
of maturation upon the principal component, spatial visualization, make
the use of adult occupational norms impossible with adolescents. The
leveling effects of experience, suggested by the decreasing reliability
coefficients with increasing age, further complicate the picture and render
the scores of older adolescents and adults difficult to interpret.

Occupational groups distinguished by high scores on this test include
machinists, manual-training teachers, ornamental ironworkers, garage
mechanics, draftsmen, and presumably other workers in mechanical
occupations, job analysis of which suggests a need for ability to visualize
space relations and interest in the acquisition of knowledge about the
nature and operation of mechanical contrivances. Whether the superior
scores made on this test by adult workers in these fields are due more
to aptitude than to experience, or vice versa, does not appear to be
important when early adolescents are being counseled, for at that stage the
test is largely a measure of aptitude which apparently leads to experience,
nor is it especially important when selecting adults for a related type of
work, for in such a case present ability to do the work is important,
regardless of its basis. It is only when long-term adjustments and ability
to learn are important that it is necessary to distinguish between
experience and aptitude as causative factors in assembly test scores.

School and college use of the assembly test has proved feasible, the test
having predictive value in junior high school, high school, and
engineering college courses. In view of the equally high validities of other tests,
and the little added to battery validity by this test at any save the junior
high school level, it is doubtful whether the time and trouble required
for its use are justified. The test may be of considerable value, however,
in the clinical study of the aptitudes and experiences of special cases.

Guidance centers and clinics are most likely to find the test valuable
in this type of case. When a client’s experience with mechanical objects
is in need of further study because of lack of mechanical outlets, or when
his aptitudes as measured by other tests of spatial visualization,
mechanical information, and manual dexterities seem out of line with his
experience, then administration of the assembly test by a skilled examiner may
prove fruitful. The ease with which the subject approaches the
apparatus, the familiarity displayed by his examining and assembling of them,
his confidence in his ability to complete the assemblies in time, his
reactions to difficulties and failure, his incidental comments concerning the
test and related matters during and after testing, all provide material in
addition to the actual score which a skilled psychologist can piece
together in order to obtain a truer picture of the client’s aptitudes,
interests, and experiences.

Business and industrial use of the Minnesota assembly test is probably
unwise because of its unreliability with adults, the leveling effects of
experience, and difficulties in administration. It is true that it can have
some value in indicating present mechanical skill in job applicants, but
if these are in a skilled category trade tests are more appropriate and
valid, and if they are semiskilled, manual and spatial tests will prove
more economical and more valid.

In summary, then, the Minnesota Mechanical Assembly Test is
important primarily for historical reasons and for the insight the studies
with it give into the nature of mechanical aptitude; its practical use is
limited primarily to the clinical study of special cases, especially in
adolescence.

The O’Rourke Mechanical Aptitude Test, Junior Grade (Psychological
Institute, 1926, 1940)

The O’Rourke Mechanical Aptitude Test was developed after World
War I, as a result of the test author’s experience with the Army
Mechanical Aptitude and Army General Trade Tests, incorporating essentially
the same items 77:(>5 11.). According to Fryer, the original work by
Rice was carried further by O’Rourke and Toops in the Army, and the
former continued to work with the test during the early 1920’s. It was
subsequently restandardized in three forms with Tennessee Valley
Authority workers (<>12). Unfortunately none of the work done by O’Rourke
has been published, leaving us entirely dependent for our understanding
of its development upon Fryer’s brief account of its origin, Toops’
dissertation, O’Rourke’s and Pritchett’s unpublished dissertations, and the
sketchy data published on the test form and scoring key.

Applicability. The civilian edition of the test prepared by O’Rourke
was first used with boys in their late teens who were interested in
entering “mechanical” occupations. Just which occupations were included
under this heading is not indicated, but the fact that his contemporary,
Thorndike, classified wrestlers as mechanical workers (828:21) suggests
that a word of caution in accepting the designation may be warranted.
The group on whom the military form had been standardized were
draftees, therefore mostly young men; the civilian group were aged 15
to 2 j, were no longer in school, and none of them had completed more
than one year of high school. The second standardization of the civilian
form was on workmen who applied for mechanical jobs with the
Tennessee Valley Authority. Again the term “mechanical” is not specifically
defined, but a list of occupations for which mean scores are provided
includes apprentices as well as journeymen, in such fields as automobile
mechanics, boiler making, carpentering, machine-shop, painting, and even
textile manufacturing. This suggests that O’Rourke’s definition of the
term mechanical is as broad when applied to occupations as it is when
applied to the types of information which make up the content of his
test.

Most important, from the point of view of the applicability and use of
the test, is the fact that the means and standard deviations for older
adolescents without mechanical training (the original norm group), for 
adult men with mechanical and other skilled and semiskilled training 
(TVA), and a group of 785 adult men in a WPA educational program 
in California (329) are approximately the same. This suggests that the 
test is probably equally applicable to older adolescents and to adults. In 
view of the evidence which suggests an effect of experience on the Min¬ 
nesota Mechanical Assembly Test this seems surprising, but it is perhaps 
due to the fact that by their middle teens boys who have mechanical apti¬ 
tudes and interest learn as much about the tools and processes tested 
as they ever will. It is conceivable that the additional trade knowledge 
gained after that time is in specialized fields and of an advanced type 
which does not affect general “mechanical” information such as is 
tapped by this test. As age differences have not been studied as such, it 
is not possible to give an adequate answer to the question of the effect 
of age and experience on this test. 

Content. The test consists of two parts. The first is pictorial; the
subject matches pictures in order to show which tools and other objects
are used together. The second part is verbal; it is a multiple-choice test
concerning tools, materials, and processes. As stated above, the term
“mechanical” is broadly conceived to include mechanics, electricity,
carpentry, cabinet-making, painting, printing, surveying, and other
activities, the items being of a type which might be learned in everyday
activities, without actual technical training. No rationale is offered for
the proportions allocated to each field, although these vary greatly.
Table 17 shows, for example, that Form A includes 19 auto mechanics
items, if) carpentry, and 19 electrical, only 1 drafting, 1 brick laying,
and 1 painting, but no plastering or shoe repairing items. At the same
time, Form B contains 2.}, if>, and 9, j, <>. and o, and 1 and 1 items
in each of these same categories. This seems likely to lessen the
equivalence of the three forms, although no notice seems to have been taken
of the fact.

Table 17

O’ROURKE MECHANICAL APTITUDE TEST

Number of Items in Each Form of the Test by Different Occupational Activities

Administration and Scoring. The two parts require 30 and 25 minutes
of working time, respectively, with a brief practice period at the
beginning. Both parts must be used, no norms being available for the subtests.
The test requires somewhat more supervision than the average group
test, because it is arranged in folder form which confuses many examinees,
and because the time limits are excessive for many high school students
who finish Part I and proceed to work on Part II before instructed to do
so. The test is frustrating to girls and to boys without mechanical
inclinations who feel that it is unreasonable to require them to sit for an hour
over questions they cannot answer in any amount of time. Scoring is by
means of an old-fashioned stencil which is placed against the answer
spaces in the test booklet. Unless the test is revised for special answer
sheets and punched stencils it is likely to lose its market to some more
administrable test of the same type.

Norms. The original norms, published in 1926, were based on 9000
boys aged 15 to l\j who were “entering mechanical occupations.” As has
already been pointed out, the meaning of this phrase is not made clear:
the individuals in question may have been mere applicants, many of
whom were rejected, or they may have been successful trainees; the
occupations may have been semiskilled, or they may have required
considerable insight and knowledge. That their educational level was not high
is shown by the fact that none had gone beyond the first year of high
school, but in 1926 that meant only that they had as much education as
the average adult male.

The TVA norms were based on 70,000 men who applied for so-called
mechanical jobs, an unusually large standardization group. These norms
differ little from the earlier set, the mean of the adolescent group being
a raw score of 198, that of the adult group 190 (equivalent to the rjjtli
percentile in the early norms). The lowest quartiles are 1(12 and 137,
respectively, and the third quartiles are 2 j5 and 2.J2. The adult group
includes more low-scoring cases than the adolescent group, perhaps
because of loss of speed with age, perhaps because of regional differences
in populations. The age distribution of the adult group is not given, but
the rural southern localities from which the latter of the two groups
came, as described by Pritchett ((>12), suggest that at least the latter reason
may apply. Hanman’s (929) study was based on California men aged 20
to Or,, with a mean age of .jo, and found a distribution of scores like that
of the original norms, which suggests that age differences are probably
not the cause.

O’Rourke’s manual also provides norms for 33 specific occupations in
the TVA population, ranging from automechanic apprentices and
journeymen, through foundrymen and plasterers, to textile worker
journeymen (just which of the Dictionary of Occupational Titles’ more than
1800 different textile workers is not specified), welders, and wood workers.

This is an unusually large number and variety of occupations for which
to provide norms, and in this respect O’Rourke has set an example for
other test authors. Unfortunately, however, there are serious hidden
defects. One of these, poor and at times even meaningless occupational
classification, has just been pointed out: it is impossible, without more
data on some of the jobs, or without reference to a standard classification
system such as the Dictionary of Occupational Titles, to know what the
norms mean. A second defect is the provision of only means and sigmas;
this is much less serious, but if the numbers are adequate, more specific
norms could easily have been provided. With no indication of the
numbers in each category, it is impossible for the test user to know whether,
as in the case of the MESRI occupational norms, the data are merely
suggestive, or whether they can really be used for norms. The importance
of this point is brought out by the fact that the means are sometimes
very close together, and sometimes even in the reverse of the expected
order. For example, millwrights make a mean of 199 whereas for
machinists the mean raw score is 211, and truck and tractor operator apprentices
score three points higher than journeymen. Differences such as the
former probably reflect in part the composition of the test which, as
pointed out in the discussion of content, is very unevenly weighted for
the various fields it taps; the latter type of difference is presumably due
to sampling errors. Both lessen one’s confidence in the value of the
norms, which may be useful as a rough indication of validity for the
directional counseling of adolescents (see below under Occupational
Differences) but which can hardly be used for the counseling or selection
of individual adults without more descriptive data and detail.

Standardization and Initial Validation. According to Fryer (277), in
their work with the Army Mechanical Aptitude and General Trade Tests,
and in their subsequent dissertations with these instruments, O’Rourke
found correlations between the two Army tests and ratings of the
mechanical ability of high school boys of about .30, and O’Rourke and Toops
found correlations with school grades which ranged from ,jf> to .41.
Correlations with Army Alpha were .30 for the Army Mechanical
Aptitude Test and .42 for the Army General Trade Test, based on a group of
208 8th grade boys. Correlations of the same tests with the Stenquist
Mechanical Assembly Test were .ji and .33, with the Stenquist Picture
Test I .4 | and .27, and with the Stenquist Picture Test II ..j6 and .33,
based on 115 8th grade boys.

The two Army tests were validated on student-soldiers awaiting return
to civilian life after World War I and on junior high school groups
studied by O’Rourke and Toops, data from whose dissertations are
provided by Fryer (277: Ch. 8). The former were rated for achievement after
the completion of courses, the numbers ranging from 24 to 61 per course.
For the automotive course the validity coefficient for the Aptitude Test
was .03, but in electrical and machine-shop courses they were .30 and
.43. Comparable validities for the Trade Test were .20, .53, and .47. For
the 208 junior high school boys the two tests had correlations of .33 and
.41 with grades; for the 100 boys who subsequently entered high school
the validities were .16 and .32 when only the reliable grades were used.

Work dealing specifically with the standardization and validation of
the final published version of the O’Rourke test has not been published.
Fryer (277:270) states that the O’Rourke is a modified version of the
Army Mechanical Aptitude and General Trade Tests. Part II, for
example, consists of 60 multiple-choice questions rather than 50 one-word
completion items as in the Trade Test from which it originated. In view
of these changes, considerable restandardization must have been done.
All O’Rourke tells us, however, is that the published form is based on
9000 fifteen- to twenty-four-year-old males no longer in school and
entering mechanical occupations. There is nothing on reliability. Concerning
validity, the manual states that “correlations reported between test scores
and ratings in vocational courses are as high as .8j; between test scores
and ratings in school vocational classes .83.” These are, it should be noted,
cited as maximum validities obtained; they are considerably higher than
the best validities reported for the two Army Tests; they are also
considerably higher than the validities of single tests generally prove to be when
they are cross-validated. They cannot therefore be taken as indices of the
actual validity of the O’Rourke test. Judgments concerning its validity
must be based solely on inferences from the Army tests and on the
published reports of subsequent investigators.

Reliability. The reliability of the O’Rourke Mechanical Aptitude
Test has never, so far as this writer has been able to ascertain, actually
been established. Bingham (<jj) estimates that the standard error of
measurement does not exceed 18 raw-score points, or less than one-half sigma,
but this is just an estimate. As the 50-item Army General Trade Test
had a reliability of .cj8 (277:268) it seems probable that the longer
O’Rourke is also reliable.
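
An estimate of this kind can be related to the usual classical formula for
the standard error of measurement, SEM = sigma x sqrt(1 - r). The sketch
below is illustrative only: the standard deviation of 40 raw-score points is
an assumed value, chosen merely to agree with the statement that 18 points
is less than one-half sigma, and is not taken from the manual or from
Bingham; the function names are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical SEM: sd * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def reliability_implied_by_sem(sd: float, sem: float) -> float:
    """Invert the SEM formula: r = 1 - (sem / sd) ** 2."""
    return 1.0 - (sem / sd) ** 2


# Assumed standard deviation (hypothetical, not reported in the source).
hypothetical_sd = 40.0

# Reliability that an SEM of 18 raw-score points would imply at that sd.
implied_r = reliability_implied_by_sem(hypothetical_sd, 18.0)
print(round(implied_r, 4))
```

Under that assumed sigma, an SEM of 18 points would correspond to a
reliability of roughly .80; a larger sigma would imply a higher figure,
which is why the estimate cannot stand in for an established coefficient.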

Validity. The intercorrelation of Parts I and II of the O’Rourke is
A- (PJ: 0 - The O’Rourke has been correlated with intelligence in an
unpublished study by the writer, who administered it and the Otis S. A. Test
to 108 high school junior and senior boys, the resulting coefficient being
.23. Sartain (669) reported a correlation of .16 for the same tests
administered to 46 aircraft factory inspectors.

Other mechanical comprehension tests with which the O’Rourke has
been correlated include the Stenquist (686), data having been obtained
from 114 7th and 8th grade boys: r = .377; the Minnesota Mechanical
Assembly Test (unpublished data of the writer’s) administered to 30
7th grade boys: r = .65; and the Bennett Mechanical Comprehension
Test (493), used with 1 j7 high school and defense-training students:
r = .55. Sartain (669) also reported an r of .55 between the O’Rourke and
the Bennett. McDaniel and Reynolds (193) found a correlation between
the O’Rourke and the MacQuarrie Test of Mechanical Aptitude as high
as .51, but in Scudder and Raubenheimer’s study (686) of junior high
school boys the correlation was only .01, a difference which it is difficult
to explain without more data. Sartain’s (669) data tend more toward a
lack of relationship with the MacQuarrie (r = .20).

The only spatial visualization test with which the O’Rourke has been
correlated is the Revised Minnesota Paper Form Board, in studies by
Tuckman (876), Sartain (669), and the writer (unpublished data). These
coefficients were .40, .09, and ..jp the subjects of the first study being
clients of a Jewish Vocational Service, those of the second experienced
factory inspectors, those of the last high school boys in the junior and
senior classes. The differences in degrees of relationship are probably
due to differences in mechanical experience.

Interests were related to the O’Rourke by Leffel in an unpublished
master’s thesis (jGo). The subjects were 121 boys in the junior and senior
years of high school. The correlations with Strong’s Vocational Interest
Blank were .42 for the Chemist key, ..jG for the Engineer key, .27 for
Mathematics and Physical Science Teacher, and approximately -.23 for
the keys for Social Science Teacher, Lawyer, and Certified Public
Accountant.

Grades were used as criteria in a study of 114 7th and 8th grade boys by
Scudder and Raubenheimer (686), with a reported correlation of . 1
between the O’Rourke test and grades in shop courses. McDaniel and
Reynolds (jcg;) used instructors’ ratings of 1 pj high school and
defense-training-course students. The multiple correlation coefficient between the
battery and ratings was .47; the validity of the O’Rourke alone was .26,
no other test having a closer relationship with the criterion. In a third
study, Ross (651) tested an unspecified number of machine-tool trainees
in the Parker Defense Training Program at Greenville, South Carolina.
He established critical scores for the tests used, that for the O’Rourke
being 175; this score would have eliminated 67 percent of the failing
trainees, together with only 7 percent of the successes. The criterion of
success was grades in the training courses. The correlation with scores on
the O’Rourke was not ascertained. A study conducted in an aviation
machinists school by the U.S. Navy during World War II (784:2)7) used
grades as a criterion. The validity was .65. Other Navy studies used
custom-built tests of similar type. It should be noted that, although the
O’Rourke Test of Mechanical Aptitude is thus shown to have some
validity for predicting the quality of work done in mechanical courses,
and about as much validity as other available tests, it gives a considerably
less accurate estimate of achievement than is suggested by O’Rourke’s
partial data.
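
The working of a critical score such as that set by Ross can be checked by
counting how many failing and how many successful trainees fall below it.
The following is a minimal sketch using invented scores and outcomes; Ross's
own data are not reproduced here, and the function name is hypothetical.

```python
def screening_rates(scores, passed, critical_score):
    """Return (share of failures rejected, share of successes rejected).

    A person is 'rejected' when his score falls below the critical score.
    """
    failures = [s for s, ok in zip(scores, passed) if not ok]
    successes = [s for s, ok in zip(scores, passed) if ok]
    if not failures or not successes:
        raise ValueError("need at least one failure and one success")
    failures_rejected = sum(s < critical_score for s in failures) / len(failures)
    successes_rejected = sum(s < critical_score for s in successes) / len(successes)
    return failures_rejected, successes_rejected


# Invented raw scores and training outcomes (True = succeeded in training).
scores = [150, 160, 170, 180, 176, 190, 200, 210, 174, 220]
passed = [False, False, False, False, True, True, True, True, True, True]

rates = screening_rates(scores, passed, critical_score=175)
print(rates)
```

Reporting both proportions, as Ross did, shows what a cutting score costs in
rejected successes as well as what it gains in rejected failures; a
correlation coefficient alone would not show this.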

Success on the job was studied with aircraft factory inspectors by
Sartain (669) and with Tennessee Valley Authority workmen by Pritchett
(612). Sartain’s report is unfortunately very brief: he provided no
information as to the type of inspection or materials inspected, although there
are probably very important differences in the psychological and
technical demands made upon inspectors of fuselages on the one hand and of
engines on the other; the sex and ages of the workers are not specified;
representativeness of the sample is assumed without evidence other than
the facts that “some of them” were relatively new and “many” were
among the most experienced in the department. Two criteria were used:
ratings (we are not told of what) in a refresher course (subject-matter
not specified), the two instructors of which were in most cases familiar
with the job performance of the inspectors, and merit ratings made by
supervisors during the year following the refresher training. There were
46 employees in the early group, and 20 still on the job one year later.
The correlation between ratings by the two instructors was .77, which
compares favorably with the reliabilities of ratings in general. When
correlated with the combined merit ratings made during the subsequent
year, the coefficient was . |2. In one sense this is a reliability coefficient,
because both sets of ratings were based partly on job performance; in
another sense it is a validation of the ratings given in refresher training,
for it shows that they were positively related to ratings of subsequent
job performance. Sartain did not repeat the correlations between tests
and merit ratings one year later, perhaps because the number of cases
was by then reduced to 20. With ratings in refresher training the
correlation for the O’Rourke test was .2 j, as compared with .42 for the Bennett
Test of Mechanical Comprehension, .47 for the Minnesota Paper Form
Board, and .(ij for the Otis.

In view of all the unknowns in this investigation, ranging from the
nature of the work, through the characteristics rated, to the similarity
between refresher training and the job itself, it is difficult to evaluate
Sartain’s findings. It may be safe to assume, in view of the findings of
other studies, that the high correlation between intelligence and ratings
was due to intellectual factors which are more important in training than
on the job; moderately high correlations between spatial relations tests
and ratings suggest that the inspection job, or at least the refresher
training related to it, required ability to visualize spatial relations; the
lower validity of the O’Rourke suggests that, at this stage of experience,
general mechanical interest and information are less important than
spatial visualization and intelligence in this type of work. Generalization
to possible use of the O’Rourke in the selection or guidance of
inexperienced workers is impossible, however, not only because the type of
inspection work and training was not described, but also because the role of the
factors measured by the O’Rourke may be quite different at the novice
as contrasted with the journeyman stages. This was seen to be the case,
for example, with manual dexterity tests and department-store wrappers.

Pritchett’s dissertation (612) might be expected to deal more directly
with job success. His data are based on the administration of the
O’Rourke to 70,000 applicants for skilled jobs with the TVA. The criteria
were efficiency ratings, promotions, demotions, and lay-offs. But no
evidence is given, beyond a brief statement to this effect.

Occupational differences in scores on the O’Rourke Mechanical
Aptitude Test are shown in the manual, by data obtained in administering
the test to applicants for TVA employment. High-scoring occupations
include journeyman electricians, machinists and sheetmetal workers with
mean raw scores equal to approximately one standard deviation above
the general mean (imp to 228); apprentices in each of these fields are
generally somewhat lower than journeymen. Low-scoring occupations
include watchmen, foundrymen, textile workers, and plasterers, all more
than one sigma below the general mean (raw scores from 140 to 147). It
is noteworthy that auto mechanics, mechanic-millwrights, plumbers, and
carpenters make mean scores not significantly higher than the general
average. This is presumably a reflection of the fact that the items in the
O’Rourke test sample a variety of skilled trade subjects, some fields being
more heavily weighted than others. We have seen that mechanical and
electrical items are most numerous in Part II, and that foundry and
cabinetmaking are barely represented; it is only logical, then, to find
carpenters and machinists making higher scores than foundrymen and
carpenter-finishers. As a trade test for selecting skilled workers the
O’Rourke is, therefore, inadequate: there are too many irrelevant items
for most trades, and not enough relevant for others. As a test for
measuring underlying aptitude in experienced workers it leaves much to be
desired, since an electrician’s score, for example, is heavily weighted by
his experience with many items, whereas a foundryman’s score is
relatively little affected by his training and experience, only one item in Part
II being directly relevant. As a general mechanical aptitude test for
untrained adolescents, on the other hand, the test seems much more
appropriate, for this group has more opportunity to follow up interest
in and aptitude for mechanics and electricity than for foundry work, and
differences in general information in these areas may more legitimately
be taken as indicative of differences in aptitude and interest. It would
seem worth while, however, to develop a mechanical or technical
information test which sampled each of the major fields adequately enough to
yield part scores which would be diagnostic of special aptitude or
proficiency (depending upon the age of the examinee) in the various fields.

In an unpublished master’s thesis, Leffel (pio) classified ilm high school
juniors and seniors according to the occupational fields which they
named as their objectives. The boys who planned to enter technical
professions or semi-professions made significantly higher scores on the
O’Rourke than did those who planned to enter other fields, while those
who planned to enter social science occupations made significantly lower
O’Rourke scores.

Job satisfaction has not, so far as the writer has been able to determine,
been used as a criterion for the O’Rourke test. It would seem logical to
expect those who have a high degree of mechanical aptitude to be
dissatisfied without outlets for it, and to expect those whose work requires
more such aptitude than they have to be dissatisfied with their
too-demanding work situations.

Use of the O’Rourke Mechanical Aptitude Test in Counseling and
Selection. The findings discussed in the preceding sections show that
the O’Rourke Mechanical Aptitude Test is only slightly correlated with
intelligence, and that it has a moderately high correlation with other
mechanical comprehension tests, with tests of spatial visualization, and
with measured interest in mechanical and scientific activities. It is
therefore possible that the acquisition of mechanical information such as is
measured by this test is the result of spatial aptitude, technical interest,
and presumably opportunity; unfortunately, no studies have been made
which prove causation. From the practical point of view, however, the
relationships between the O’Rourke and tests of these other factors are
low enough to warrant using it in a battery of tests for appropriate
persons and for suitable purposes.

Changes in scores with age after mid-adolescence have not been
brought out by the norms, but this may be due to failure to make a
refined analysis of age differences; the only data are the similarity of the
means of older adolescent and adult groups. This seems surprising in an
information test, but it could be due to the fact that the items in the test
tap a low level of information which is generally acquired from
miscellaneous sources during adolescence, rather than the higher level of
technical information which is learned in training or on the job. That this is so
has not been demonstrated, but the existence of two such levels of
information has frequently proved to be a good working hypothesis in test
construction, and its use by O’Rourke is implied in the sub-title, “Junior
Grade.”

The occupational significance of the O’Rourke test can only be broad,
because of the unbalanced heterogeneity of the so-called mechanical
items it contains and because of the resulting dislocation of the
occupational norms. Evidence with both adolescents and adults indicates that
the test has some value in distinguishing those who have some aptitude
for technical work from those who have little such aptitude; it does not,
however, make possible differential diagnosis or prediction within the
field of technical work.

In schools, technical institutes, and colleges, the O’Rourke test should
prove most useful with those who have had no training and no systematic
experience in technical fields. In such instances it will reveal the extent
to which the person in question has sought and utilized opportunities for
the exercise of technical aptitudes and interests. It will not help in
determining in which of the various technical fields he is likely to do best or
find most satisfaction, but it does have value in general directional
guidance. It can normally be expected to improve the selection of high school
students who will do well in technical courses, but is not likely to predict
success as well as the test manual implies.

In guidance centers and employment the possible uses of the test are
about what they are in educational institutions. It can be useful in
selecting promising young trainees or entry workers for industrial employment,
supplementing the history of mechanical and related interests and
activities.

In industry the O’Rourke test is also useful in selecting young people
for entry jobs and for training opportunities, as a measure of previous
exposure to and profit from incidental technical experiences. Although
it cannot properly be used as a trade test, it has been shown to have some
value as a screening device even for experienced workers on technical
jobs, when large numbers have to be employed and the evaluation of
experience is difficult. In any case, the O’Rourke should be supplemented
by purer and less easily contaminated tests of aptitudes such as
intelligence and spatial visualization, and, in the case of experienced workers,
by trade tests; and it should go without saying that personal data should
also be utilized.

The Bennett Test of Mechanical Comprehension (Psychological
Corporation, 1940)

This test of mechanical aptitude was developed after a survey of
existing tests of mechanical aptitude led the author to the conclusion that
there was a need for a test which would measure a higher order of
mechanical aptitude than that assessed by available tests. The facts
concerning the Minnesota and O’Rourke tests, summarized and discussed
in the preceding sections of this chapter, partially substantiate that
conclusion, as would those concerning other tests of mechanical aptitudes
were they similarly treated.

Applicability. There are three forms of Bennett’s test: AA, designed
for high school students, engineering school applicants, and other
relatively untrained and inexperienced groups, most widely used and
therefore selected for detailed treatment in this chapter; BB, more difficult
and designed for use with engineering school applicants, candidates
for technical courses, and applicants for mechanical employment; and
W1 (developed in collaboration with Dinah E. Fry), designed for use
with high school girls and women. An attempt was made to devise items
appropriate to the aptitude and experience of each of these types of
groups. In the case of the women’s form, for example, items used embody
what seem to be the same types of physical principles, but the objects and
situations are such as are more commonly encountered by women than
those in the men’s forms; they involve the kitchen and the sewing room
more than the shop and the garage. That this goal of devising items
suitable to the group in question was reasonably well attained is
illustrated by the fact that 9th grade boys make raw scores which range from
5 to yp with a mean of 31, whereas 12th graders make scores ranging
from ri at the first percentile to 57 at the 99th, the mean being ^9. As
the total number of items is 60, this demonstrates that most of the items
are actually working at this age range, and that the improvement which
takes place with age in adolescence does not make the test too easy.
Freshmen engineers, on the other hand, make raw scores of 56, 57, and
59 at the 90th, 95th, and 99th percentiles, and a raw score of 47 at the
50th percentile, showing that the test is so easy for freshmen engineers
that the most able cannot show the true extent of their ability. Form AA
is suitable for engineering school applicants, as the author states, for in
such a selection program the principal objective is to screen out those
who are too weak rather than to locate those who are unusually able;
in a scholarship program, however, it would be better to use Form BB,
thus achieving discrimination at the top and locating the most able.
Another indication of the suitability of the special forms lies in the fact
that women’s scores average about 12 points lower than the scores of
comparable men on the men’s form ((hj).

The question of the effect of having studied physics upon scores on
a test such as Bennett’s is frequently raised, since the items measure
understanding of and ability to apply physical principles. Two studies
have investigated this problem, both reported in the manual. In one
study 315 applicants for defense-industry training answered a question
concerning previous training in physics. The 220 persons who had had
such training made a mean score of 43.7, while the 95 reporting no
training made a mean score of 39.7, a difference which was in the expected
direction but not great enough to be statistically significant. Expressed
in percentiles, one group was at the 60th and the other at the 70th
percentile, both of which can be thought of as average. Since four
raw-score points (equal to less than one-half sigma) generally make less
difference than this in percentiles, the difference can be thought of as
practically insignificant also. A similar analysis was made of data obtained from
1 j71 candidates for positions as firemen and policemen in New York City;
the biserial r between having had training in physics and score on
Bennett’s tests was .26, and the difference in the means was again four points
or less than one-half of one standard deviation.
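
A coefficient of this kind is computed from the two group means, the
standard deviation of all scores, and the proportions falling in each group.
The sketch below uses made-up scores to show the conventional biserial
formula beside the point-biserial form. Whether the manual used exactly
this computing formula is not stated, so the code is offered as an
assumption, and its function names are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def _split(scores, trained):
    """Separate scores into the trained (1) and untrained (0) groups."""
    group1 = [s for s, t in zip(scores, trained) if t]
    group0 = [s for s, t in zip(scores, trained) if not t]
    return group1, group0


def point_biserial(scores, trained):
    """Pearson r between a 0/1 variable and a continuous score."""
    group1, group0 = _split(scores, trained)
    p = len(group1) / len(scores)
    q = 1.0 - p
    return (mean(group1) - mean(group0)) / pstdev(scores) * (p * q) ** 0.5


def biserial(scores, trained):
    """Biserial r: assumes a normal variable underlies the dichotomy."""
    group1, group0 = _split(scores, trained)
    p = len(group1) / len(scores)
    q = 1.0 - p
    # Ordinate of the unit normal curve at the point cutting off proportion p.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean(group1) - mean(group0)) / pstdev(scores) * (p * q) / y


# Made-up raw scores; 1 = reported physics training, 0 = none.
scores = [44, 39, 47, 35, 41, 50, 38, 42]
trained = [1, 0, 1, 0, 1, 1, 0, 0]

r_pb = point_biserial(scores, trained)
r_bis = biserial(scores, trained)
print(round(r_pb, 3), round(r_bis, 3))
```

The biserial value always exceeds the point-biserial in absolute size, so
the two should not be compared directly with each other or with ordinary
product-moment coefficients.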

Content. The items of the Bennett Test of Mechanical
Comprehension, unlike those of the O’Rourke, are objects which are almost
universally familiar in American culture: airplanes, carts, steps, pulleys,
windlasses, see-saws, and cows. In this respect the test is presumably less
subject to the effects of differences in experience and environment than
is the O’Rourke. This is probably also true of what the examinee must
do with the objects in order to take the test, for the tasks require
comprehension of the nature, operation, and effects of various physical
principles rather than knowledge of specific tools or items of equipment and
their uses. To put it concretely, in Bennett’s tests it is not a matter of
what to use a pulley for, but rather one of how weight is distributed on
pulleys when they are used. The only knowledge needed for the latter
type of item is an idea of the general nature and use of pulleys; the
answer can be found by logical analysis of the problem, that is, by
mechanical comprehension. There are a total of 60 such items. The
existence of a sex difference equal to one and one-half sigma (manual)
shows that cultural factors affect even this test, but it seems likely that,
for a given sex, they are less important, as witness the data on physics
training.

Administration and Scoring. The test has no time limit, being
designed as a power rather than as a speed test. The majority finish in less
than 25 minutes, and a 30-minute time limit is ample for almost any
group. Booklets are used a number of times, a special answer sheet being
provided for responses. Sample problems help orient the examinee to
the methods and forms. Scoring is by means of stencils, either by hand
or in the IBM scoring machine. Both administration and scoring are
simple and expeditious.

Norms. For Form AA three sets of norms are available, one for
educational groups, one for occupational groups, and one for women. For
Form BB they consist of data for technical educational groups and
applicants for mechanical work. The women’s form has educational and
occupational norms.

The educational norms (Form AA) are for each of the four years of
high school, each year being based on from 300 to 833 boys; for
technical-high school seniors; for introductory engineering school freshmen, there
being from 402 to 613 cases in each of these last groups. The means
increase from year to year, and from less-selected to more highly selected
groups.

The so-called industrial norms are in some cases more truly
educational, as when they are based on candidates for WPA mechanical courses
or on clients of a veterans guidance center (veterans are not an
occupational group, but a cross-section of young men). In other cases they are
marginally occupational, being based, for example, on candidates for
positions as policemen and firemen (occupational norms could have been
obtained by excluding those not actually appointed), candidates for
apprentice training, candidates for engineering positions (as their average
education equalled two years of college they could not be considered
engineers without substantial appropriate experience), and applicants for
jobs as mechanics’ helpers, unskilled laborers, and leadmen. Only two
groups are truly occupational, the paper-factory workers and bus and
street-car operators. The numbers in each of these categories range from
145 candidates for engineering positions to 2217 applicants for
employment as mechanic’s helper. The two strictly industrial or occupational
groups number respectively 1637 and 734.

While the numbers are generally sufficiently large, there is no way of
knowing how representative they are: the schools and colleges from
which the educational norms were obtained are not specifically identified,
although some can be guessed from the list of acknowledgments in the
manual, and the pre-occupational and occupational norms are not
described as to location, number of companies, age, or other variables,
although here again one can identify some groups by deduction and
obtain further information from the original studies: the defense-course
trainees, for example, appear to be Moore’s (536) cases, while the
paper-factory workers are a group tested in a Savannah, Georgia plant but not
described in any detail (72).

Women’s norms for Form AA are based on one small group of college
freshmen (N = tit), a moderately large group of wartime applicants at
an employment agency (N = 23H), and 1090 trainees in an airplane
factory. With no other information concerning the college and
employment agency groups their norms are of little value, for other colleges
may have different types of students and women job seekers are not the
same in peace and in war. The airplane factory workers constitute a
large and, judging by other information about women workers in
wartime airplane factories, heterogeneous enough group so that they can
be of some use. The limited norms for women are perhaps not too
important in any case, since women do not ordinarily compete for
mechanically demanding jobs in peacetime and, when they do, must hold their
own with men. In a period of industrial mobilization for war production
the opposite is, of course, true, and an instrument which can select
mechanically apt even though inexperienced women is of great value.

As the manual has been revised by its author in order to keep it up to
date (in less detail than one might wish) and he and his associates have
continued to publish new studies involving the test, it can probably be
assumed that the defects in the norms will be progressively minimized,
and that in due course both more representative samples and more
adequate descriptions of the samples will be made available.

Standardization and Initial Validation. As described in the manual,
preliminary work with this test consisted of preparing rough sketches of
proposed items and trying them out on various types of persons. After
elimination and revision of items, 75 were tried out in booklet form. As
a readily available criterion for the retention of items in the test, scores
on three existing tests of mechanical aptitude were combined with the
Bennett scores: these were the MacQuarrie, the Detroit, and the Revised
Minnesota Paper Form Board. The responses of the highest and lowest
scoring 27 percent of a group of 283 applicants for skilled technical
training were used for item analysis, just as in an item analysis of one test by
itself but with the advantage of additional items designed to measure the
same trait to help differentiate the most able from the least able
candidates. This is therefore neither an internal consistency validation nor
a validation against existing tests, but rather a mixture of the two. As a
result of this procedure the number of items was reduced to 60, plus two
easy items which were retained as practice questions; having survived
such an analysis, these items can be presumed to be measuring the same
trait or constellation of traits, to be measuring something resembling
what others have called mechanical aptitude, and, if these other tests
have some validity for measuring promise in mechanical work (as they
do), to have some validity as a test of mechanical aptitude. Such indirect
proof of validity is not satisfactory in and of itself, but it suffices as a
first step, the successful taking of which then justifies the labor of
validating against occupational criteria.
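
The upper-and-lower-group item analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 0/1 item scoring, and the sample data are hypothetical, and the index computed (proportion passing in the top 27 percent minus proportion passing in the bottom 27 percent) is one common discrimination index, not necessarily the statistic used in preparing the Bennett test.

```python
def upper_lower_discrimination(responses, criterion, fraction=0.27):
    """Return one discrimination index per item.

    responses: list of rows, one per examinee, each a list of 0/1 item scores.
    criterion: list of criterion scores (here a composite), one per examinee.
    """
    n_people = len(criterion)
    group_size = max(1, round(n_people * fraction))
    # Rank examinees from lowest to highest criterion score.
    order = sorted(range(n_people), key=lambda i: criterion[i])
    low_group = order[:group_size]
    high_group = order[-group_size:]

    indices = []
    for item in range(len(responses[0])):
        p_high = sum(responses[i][item] for i in high_group) / group_size
        p_low = sum(responses[i][item] for i in low_group) / group_size
        # Positive values mean the item favors the high-criterion group.
        indices.append(p_high - p_low)
    return indices


# Hypothetical data: ten examinees, three items.
criterion = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
responses = [[1 if c >= 6 else 0, 1, 1 if c <= 5 else 0] for c in criterion]
print(upper_lower_discrimination(responses, criterion))
```

In this made-up example the first item separates the groups perfectly, the second is passed by everyone and so discriminates not at all, and the third favors the low group; items of the last two kinds are the ones such an analysis would eliminate.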

Reliability. The only reported reliability coefficient located by the
writer is that given in the manual, .8 j for a group of you 9th grade
boys, calculated by the split-half method. This is sufficiently high,
especially for such a homogeneous group; it would presumably be higher
if the age and ability range were greater.
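
As a rough illustration of the split-half method, the sketch below correlates scores on the odd-numbered and even-numbered items and then applies the Spearman-Brown formula to estimate the reliability of the full-length test. The odd-even split, the function names, and the sample data are assumptions made for the example; the text does not say how the halves were formed or whether the manual's coefficient was stepped up in this way.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """responses: one row of 0/1 item scores per examinee."""
    odd_scores = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_scores = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_scores, even_scores))


# Hypothetical data: four examinees, four items.
rows = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0], [1, 0, 1, 0]]
print(round(split_half_reliability(rows), 3))
```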

Validity. Because of the strength of its rationale and the consulting
activities of its author, the Bennett Test of Mechanical Comprehension
has been used in a surprisingly large number of studies, including
several in the Army, Navy, and Air Force which have not yet been
reported in the general literature. Criteria used have included not only
other tests, but grades and supervisors’ ratings; output and other
objective vocational criteria have not, however, as yet been utilized as criteria,
perhaps partly because the test was designed and used primarily for jobs
above the semiskilled level in which success cannot often be judged by
production records.

Tests of intelligence which have been correlated with Bennett’s have
been summarized in a table in the manual. Of special interest are the
correlations of .25 and .45 with the Otis S.A. Test based on 1 yb high
school and on 292 defense-training students. The manual does not
indicate the age or grade range of the high school students, but the low
correlation may be due to homogeneity; the higher correlation for the
defense-industry trainees is presumably due to greater ranges of
education and age. Other correlations with the Otis test have been reported by
Sartain (669, 671), who found a relationship of only .175 between the
two tests with a group of 46 inspectors in an aircraft plant (presumably
a homogeneous group) and one of .37 for 40 aircraft factory foremen
and assistant foremen. The relationship with the A.C.E. Psychological
Examination reported in the manual for 212 technical-high school
seniors (apparently in Springfield, Mass.) is .55; working with 230
Merchant Marine Cadets, Traxler (863) found a correlation of .37. For
the L score the coefficient was .3}, and that for the Q score was .26. This
tendency for verbal intelligence to be at least as closely related to
mechanical comprehension as quantitative intelligence is confirmed by
Carnegie Mental Ability Test data reported in the manual: r = .51 for
the L score, .52 for the Q score, the subjects being 131 defense trainees.
It seems that in fairly heterogeneous groups abstract mental ability is
moderately related to mechanical comprehension (as indeed the term
implies), whereas in homogeneous groups it is quite distinct. This makes
its measurement in technical training institutions which select largely
on an intellectual basis especially pertinent, assuming that the test
actually has predictive value.
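
The effect of homogeneity on a correlation coefficient can be made concrete with the standard formula for restriction of range under direct selection on one variable. The sketch below is not taken from the studies cited; it assumes a linear relationship with equal scatter throughout the range, and the function name and the figures used are purely illustrative.

```python
import math


def correlation_in_restricted_group(r_full, sd_ratio):
    """Correlation expected in a group selected on one of the variables.

    r_full:   correlation in the full, heterogeneous group.
    sd_ratio: standard deviation of the selected group divided by that of
              the full group on the selection variable (1.0 = no selection).
    """
    numerator = r_full * sd_ratio
    denominator = math.sqrt(1 - r_full ** 2 + (r_full ** 2) * (sd_ratio ** 2))
    return numerator / denominator


# Illustrative figures only: a correlation of .45 in a wide-range group,
# and a selected group with half the spread on the selection variable.
print(round(correlation_in_restricted_group(0.45, 0.5), 3))
```

Under these assumptions a moderate correlation in a wide-range group shrinks considerably when the spread of scores is halved, which is the kind of difference between heterogeneous and homogeneous samples discussed above.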

Manual dexterity tests which have been correlated with Bennett’s
include the Psychological Corporation's Large Hand-Tool Dexterity Test
(disassembly and assembly of nuts, washers, and bolts with wrench and
screwdriver), the Minnesota Manual Dexterity Test, and the O’Connor
Finger and Tweezer Dexterity Tests. The first study is reported in the
manual, the subjects being 89 veterans in a guidance center and 1109
paper bag factory workers: the correlations equalled .39 and .28. The
Minnesota Manual (Placing and Turning) Tests and the O’Connor tests
were used by Jacobsen (396) in a study described in an earlier chapter.
For <jo mechanic learners he found correlations of .21 and .14 with
Placing and Turning Tests, and of — .oj and .1 | with Finger and Tweezer
Dexterity, respectively. It seems surprising that there should be a
relationship between mechanical comprehension and gross manual dexterity
as measured by the hand tool test but not as measured by an arm-and-hand
movement test. It would seem more logical that there be no
relationship at all between dexterity and comprehension, as suggested by
Jacobsen’s data. More evidence is needed.

Mechanical aptitude has been measured by other tests and correlated
with the Bennett in several studies reported in the manual and in two
other studies by Sartain (669) and by McDaniel and Reynolds (493). The
test reported on in the manual is the MacQuarrie, administered to 136
applicants for WPA mechanical courses and to 220 applicants for
apprentice courses, with correlations of .40 and .48. Sartain’s correlation
coefficient was .44 for aircraft factory inspectors; McDaniel and Reynolds’
was .55 for 147 defense-training students. These correlations are only to
be expected, in view of the use of the MacQuarrie as part of the internal
consistency criterion in selecting items for the Bennett test.

Spatial visualization tests used with Bennett’s and throwing some light
on what it measures are the Revised Minnesota Paper Form Board and
the Crawford Spatial Relations Test. Correlations reported for the
former in the manual are consistent, ranging from .\ \ for 2oh technical
high school seniors to .59 for 136 applicants for WPA mechanical courses;
Traxler (863) reported one of .39, but Sartain (669,671) found
relationships of .27 and .31, and Jacobsen (396) reported a coefficient of .00.
These inconsistencies are difficult to explain, but Jacobsen’s finding is
so unlike the others that it may perhaps be disregarded. The trend then
rather clearly is for the two tests to be moderately closely related, as they
should be in view of the use of the Paper Form Board in selecting
Bennett items. Jacobsen is the only author who has reported on the
relationship of the Bennett to the Crawford test, his r being only .iH.

Interest was correlated with Bennett scores by Moore (^30) who used
Strong’s Vocational Interest Blank as a measure of interest. His subjects
were two groups of engineering defense-training students, numbering
205 and 292 respectively. The correlations between the Bennett and
Strong’s Engineering key were .30 and .35 for the two groups; for the
Aviator key they were .21 and .26; for the Production Manager key they
were .12 and .08; and for Carpenter they were .06 and .12. These findings
suggest that the higher the level of mechanical comprehension, the
higher the level of technical interest, for the higher correlations are for
the technical occupations. This is not confirmed by the somewhat
different mechanical and scientific keys of the Kuder Preference Record
(671), the correlations with which are only .15 and .15 for a more
homogeneous group of foremen.

Grades in technical courses, standing on examinations in technical
subjects, ratings of students and learners by instructors, and ability to
complete technical training courses have been used as criteria in training
situations. Grades made by 1834 defense industry trainees in a chemistry
course were correlated with Bennett scores by Moore, r being .36.
For 137 shop trainees of Pan-American Airways the correlation with
shop grades was Jrj. Moore also obtained correlations of .yj with
final examination scores in defense-training chemistry courses, and .52
with final examinations in the physics course. The latter examination
was a Co-operative Test Service physics test; the manual also reports a
correlation of . ]2 between the Bennett and College Entrance Board
Physics Examination scores of 275 applicants for an engineering school.

Not reported in the manual are two studies of the test’s value in
predicting ratings of the mechanical promise of war industry trainees.
McDaniel and Reynolds (493) used a group of high school students and
defense industry trainees, 1 jy in number. Their criterion was instructors’
ratings of learning aptitude, speed and accuracy in acquiring muscular
and manipulative skills, quality and precision of work, and eagerness in
getting at the job and staying with it, combined into one overall rating
of promise. Ten-point scales (too refined for use by non-psychologically
trained raters) with behavior descriptions were used for each of the four
traits. No data are presented as to the reliability of the ratings; their
correlation with Bennett scores was .2 j, approximately that for the
O’Rourke and slightly higher than those for various parts of the
MacQuarrie Mechanical Aptitude Test.

Jacobsen's study (396) has been described in connection with other
tests. He found that the correlations between Bennett scores and ratings
of fitness for mechanical work as judged in courses in aircraft
instruments, airplane engines, aeronautical repair mechanics, machine shop,
and aircraft electricity were .11. .go, gjy, and .jr respectively (P.E.
equalled .07 to .op). When combined with other tests the multiple
correlations ranged from . j(i (repair mechanics) to .b j (instruments), except
for ratings in the course in aircraft engines; perhaps this was due to
defects in the ratings in this course, rather than to differences in the
psychological demands which it made on the learners.
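
For readers unfamiliar with the probable error (P.E.) quoted with correlations of this period, the sketch below uses the classical large-sample formula, P.E. = 0.6745 (1 - r^2) / sqrt(N). The formula is the conventional one and is assumed here; the source does not state how Jacobsen's probable errors were computed, and the figures in the example are illustrative rather than taken from his study.

```python
import math


def probable_error_of_r(r, n):
    """Classical probable error of a product-moment correlation.

    Half of all sample correlations would be expected to fall within one
    probable error of the true value, under the usual normal assumptions.
    """
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# Illustrative figures only: r = .30 based on 90 cases.
print(round(probable_error_of_r(0.30, 90), 3))
```

A coefficient was commonly regarded as trustworthy in this literature only when it was several times its probable error, which is why small samples yield correlations that must be read with caution.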

Bennett points out in his manual that many validity coefficients were
obtained for his test or for very close copies of it in the armed forces.
One part of the Army Air Force Qualifying Examination (195) consisted
of from 15 to 60, generally 30, Bennett-type items; validity coefficients
for various forms correlated with success-failure in primary pilot
training ranged from .1] to .38; for graduation-elimination in navigator
training the validities ranged from .22 to ..gr,; and for bombardier
training, the criterion of success for which was not satisfactory, the one
validity coefficient reported was .13. In an experimental group of 1080 cadets
sent to pilot training regardless of test scores (ai.j), the validity coefficient
for the mechanical comprehension part of the Qualifying Examination
was .47 (graduation criterion); the Mechanical Principles Test of the
regular cadet test battery had a validity coefficient of ..13 for this group;
only two tests had higher predictive values, one entitled Instrument
Comprehension II (r = . jS) and the other a Test of General Information
scored for pilots (r = .51).

The test was used by the Army and Navy, in various forms, for the
selection of trainees in other specialties. The Army Mechanical Aptitude
Test included 22 items from Bennett’s Form AA, plus others resembling
it; the Navy had its own forms also. Validity data for some of these are
reported by Fredericksen (273) and by Stuit (785), but will not be cited
in detail here, as the Air Force data illustrate them. As Bennett’s manual
puts it, whenever the ability to understand machines is important the
test and its derivatives are likely to have fairly high validity. Navy
technical courses for which the Bennett type tests were validated are
listed in Table 18, with validities.

Table 18

RELATIONSHIP BETWEEN BENNETT SCORES AND NAVY GRADES

School                   Course             r
Submarine School         Torpedoes          ■2.3
                         Communications     .23
                         Submarines         .23
                         Engineering        .39
Indoctrination School    Seamanship         .j8
                         Ordnance           -2Q
                         Navigation         .36
                         Final Average      .35


Success on the job as measured by ratings of supervisors has been
correlated with Bennett scores by Bennett and Fear (70), McMurry and
Johnson (500), Sartain (669,671), Schultz and Barnabas (682), and
Shuman (716,717). In Bennett and Fear’s study 60 machine-tool-operator
trainees were tested prior to training and were rated by their supervisors
for performance on the job several months later. The reliability of the
criterion was apparently not checked. Test scores and ratings of job
performance had a correlation of .69, an unusually high validity for one
test which would need to be confirmed in other similar studies before
being accepted (Shuman’s study, discussed below, found an r of .44 for
machine operators). As a result of this finding only applicants who rated
A or B on a combination of this and one other test were employed, the
group as a whole making good employment records, as evidenced by the
fact that, “of all new men hired since tests were installed 76 percent were
rated as ‘excellent’ or ‘good’ on the job. Only S percent were rated
‘below average.’ None were rated as ‘poor.’ Not a single new man,
hired since tests were introduced as part of the selection procedure, has
had to be dismissed because of lack of ability to do the job.” Perhaps this
conclusion needs to be qualified by a reminder of the facts that
supervisors are generally reluctant to use the “poor” rating, and that during
the war employers were reluctant to release employees.

Further confirmation is provided by McMurry and Johnson, who
tested ybp ordnance factory employees at the time of selection with a
battery including the Bennett test. Supervisors’ ratings of 587 of these
were obtained after they had been on the job some time. Validity
coefficients were computed for occupationally homogeneous subgroups. For
a group of 33 cranemen the Bennett test had a validity of .by; other
occupational groups were also tested, but validities are not reported for
the Bennett alone.

Sartain’s first study has been discussed elsewhere; his ratings, it will
be remembered, were of performance in a refresher training course for
aircraft factory inspectors already on the job whose job performance was
known to the instructors. The correlation between Bennett scores and
this mixed training-job criterion was .32, lower than those of .by .bp
and .47 for the MacQuarrie, Otis, and Minnesota Paper Form Board. In
his second study, the subjects of which were 40 aircraft factory foremen
and assistant foremen rated by their supervisors, the correlation between
Bennett scores and ratings was —.15. This may prove that foremen in
this plant were judged more by success in handling employees than by
success in coping with mechanical problems, and is probably no
indication of the validity of the test for mechanical and technical work.
Shuman’s study, discussed below, suggests that in some situations the
mechanical comprehension of foremen is considered by raters; Schultz
and Barnabas’ investigation also bears on this point.

Employee relations and “budget-control efficiency” of 30 foremen and
assistant foremen were rated by supervisors in the study reported by
Schultz and Barnabas. The foremen were tested with a battery made up
of the Bennett Mechanical Comprehension Test, the Strong Vocational
Interest Blank (scored for Production Manager and Occupational Level),
and the Bernreuter Personality Inventory (combined scores). The
reliability of the ratings was determined by re-rating at the end of a
five-month period. The correlation between the combined ratings on
“employee relations” and “budget-control efficiency” for the first and
second ratings was .83. When T scores for the three predictors were
combined a correlation of .32 with combined ratings was obtained. The
correlation between Bennett scores and the criterion was .11.

In Shuman’s study of aircraft-engine and propeller factory workers the
criterion of success was supervisors’ ratings of efficiency on the job.
Workers were rated as good, average, or poor, in consultation with rating
experts. In two departments the ratings thus made were correlated with
ratings made by a departmental instructor trained in rating techniques.
The reliabilities thus obtained were .91 for 92 production engine testers
and .703 for g(i inspectors; the former correlation being so high as to
make one wonder about possible contamination of data through
discussion by supervisor and instructor. Tests were administered to operators
who had been on the job for six months or more; ratings were secured
after testing. New applicants were also tested at the time of application,
and those employed were followed up six months later and rated. These
two groups were combined, the possible differential effects of pre- and
post-hiring testing apparently not being investigated. The numbers in
each occupational group varied from 23 (job setters) to 99 (foremen).
Biserial coefficients of correlation were computed between the tests used
(Otis, Minnesota Paper Form Board, and Bennett) and supervisors’
ratings, by occupation. Data for the Bennett Test are presented in
Table 19.


Table 19

BISERIAL CORRELATIONS BETWEEN JOB RATINGS AND BENNETT SCORES

                                    Critical Scores    Percent Improvement
Job                   N    r bis    Male    Female     Male    Female
Inspectors            49            34      19         12      28
Engine testers        45   .17      33                 10
Machine operators     81   .44      27      18         22      12
Foremen               99   .46-3    30                 10
Job setters           25   .73      36                 47
Toolmaker learners    64   .46      36                  5
Mean                 363   .52                         18
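
For reference, a biserial coefficient of the kind reported in Table 19 can be computed as in the sketch below. It uses the textbook formula, r_bis = ((M_upper - M_lower) / s) * (p * q / y), where y is the ordinate of the unit normal curve at the point dividing the proportions p and q. The use of the population standard deviation, the function name, and the sample data are assumptions; how the three rating categories (good, average, poor) were reduced to two groups in the original study is not stated.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(scores, in_upper_group):
    """Biserial correlation between test scores and a two-way rating.

    scores:         test scores, one per worker.
    in_upper_group: True/False flags, True for the better-rated group.
    """
    upper = [s for s, flag in zip(scores, in_upper_group) if flag]
    lower = [s for s, flag in zip(scores, in_upper_group) if not flag]
    p = len(upper) / len(scores)
    q = 1.0 - p
    unit_normal = NormalDist()
    # Height of the normal curve at the cut separating the two groups.
    y = unit_normal.pdf(unit_normal.inv_cdf(q))
    return (fmean(upper) - fmean(lower)) / pstdev(scores) * (p * q / y)


# Hypothetical data: eight workers, four rated in the better group.
scores = [2, 4, 5, 7, 3, 6, 8, 9]
flags = [False, False, False, False, True, True, True, True]
print(round(biserial_r(scores, flags), 3))
```

Because the biserial coefficient assumes a normally distributed trait underlying the two-way rating, it runs higher than the corresponding point-biserial coefficient and can even exceed 1.00 in small or irregular samples, a point worth bearing in mind when reading coefficients based on groups as small as those for the job setters.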



We have already seen, in the discussion of the Otis Test, that the latter
had substantial validity for all of these jobs (r = .39 to .37); it is
interesting that the validities for the Bennett are lower in some cases (engine
testers) and higher in others (inspectors and job setters). It would be
helpful, in such a case, to have job descriptions which would throw light
on the reasons for these differences, but presumably in the engine tester’s
work there is no advantage in having more than the minimum required
degree of mechanical comprehension (perhaps it is more a matter of
manual dexterity in making connections and perceptual ability in
reading dials), while in the job setter’s and inspector’s work higher degrees
of understanding of mechanical principles make for greater worker
efficiency (which is understandable if the inspectors were engine
inspectors).

Critical scores were set for each of the tests used, those for the Bennett
being shown in the next-to-the-last column of Table 19. In jobs utilizing
both men and women sex differences made special minima necessary;
in other jobs men only were employed. The differences between these
minima indicate that the machine operators’ work requires least
mechanical comprehension (it also requires the least mental ability), and
that job setters and toolmaker learners are the most highly selected
groups in mechanical comprehension: this is what one would expect,
and can be taken as a sign of the validity of the test. Foremen, for whom
supervision of personnel is more crucial than mechanical aptitude, also
have a lower critical score than the more technical workers, although
in this situation, unlike Sartain’s, mechanical comprehension does play
some part in foreman success as judged by the supervisors. As the final
column of Table 19 shows, the hiring of workers on the basis of the
established critical minimum scores would be improved by from 5
percent in the case of toolmaker learners to 47 percent in the case of job
setters, with a mean hiring improvement of 18 percent for all the jobs in
question. The Bennett test contributed more to the improvement of
selection than either of the other tests used, except possibly in the case
of inspectors and toolmaker learners.
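
One plausible way of expressing the improvement in hiring that a critical score would bring is sketched below: the percentage of satisfactory workers among those at or above the critical score, less the percentage of satisfactory workers in the whole tested group. This definition is an assumption made for illustration, since the basis of the published figures is not described; the function name and data are likewise hypothetical.

```python
def selection_improvement(scores, satisfactory, critical_score):
    """Gain, in percentage points, from hiring only at or above a cutoff.

    scores:         test scores, one per worker.
    satisfactory:   1 if the worker was rated satisfactory, else 0.
    critical_score: minimum score that would have been required.
    """
    base_rate = 100 * sum(satisfactory) / len(satisfactory)
    selected = [ok for s, ok in zip(scores, satisfactory) if s >= critical_score]
    if not selected:
        raise ValueError("no workers reach the critical score")
    selected_rate = 100 * sum(selected) / len(selected)
    return selected_rate - base_rate


# Hypothetical data: four workers, two of them rated satisfactory.
print(selection_improvement([10, 20, 30, 40], [0, 0, 1, 1], 30))
```

Whatever definition is used, an improvement figure of this kind depends on the proportion of satisfactory workers already present in the group, so figures from different jobs or plants are not directly comparable.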

Supervisory workers in three factories were studied in another
investigation in which Shuman used the same battery of tests. Foremen,
group leaders, and job setters were rated as to production, handling of
workers, housekeeping, and overall opinion by their superiors, the total
usable group numbering 208. The mean correlation between Bennett
scores and ratings of several groups of foremen was .55. Minimum critical
scores were established for each job, that for foremen being 30, and that
for group leaders 2<i. When data for all supervisors were combined, the
percent improvement in selection of excellent workers which would have
been effected by use of the Bennett test was 1S, exceeded only by the
Otis’ 19 percent.

Occupational differences in mechanical comprehension as measured by
the Bennett test are shown by Shuman’s studies (716,717) and by the
industrial or occupational norms reported in the manual. As Shuman’s
basis for establishing critical scores is not described, and as he does not
present data on means and sigmas, it is not possible to integrate his
published findings with the industrial norms of the manual. However,
we have seen that according to him, job setters and toolmaker learners
require more mechanical comprehension than do other skilled and
semiskilled workers in airplane-engine and propeller factories, and that
machine operators require least. The critical scores (apparently close to
Q1) for toolmaker learners and job setters are at the ;;oth percentile for
trainees in an airplane factory, and at the 50th for candidates for police
and fire department appointments, as shown in the manual. The critical
score for inspectors was at about the irpd and ppd percentiles when
compared to the same groups. That for machine operators was at the
17th and 20th. These data suggest that the most skilled jobs in an
airplane factory require only a modicum of such ability. In the norms
provided by the manual, it is the candidates for engineering positions
(average education equalled two years beyond high school) who ranked
first, trainees in an airplane factory second, and men in defense training
courses and applying for leadman jobs third, while candidates for WPA
mechanical training courses, workers in a paper-bag factory, and
applicants for employment as mechanics’ helpers made the lowest mean
scores. These data are still limited to too few occupations, in too few
plants, to be more than suggestive. Norms for other skilled and also for
professional-technical jobs should be provided at an early date.

Job satisfaction has not as yet been used as a criterion for the
validation of the Bennett Mechanical Comprehension Test.

Use of the Bennett Mechanical Comprehension Test in Counseling
and Selection. The reported relationships between the Bennett and
other tests make it clear that, when the group being tested is
homogeneous, there is little relationship between mechanical comprehension
and intelligence; since they are both abstract functions, however, it is
only natural that they should appear to have some relationship when the
groups concerned represent considerable spread in mental ability. This
test has been seen to be closer to spatial visualization, a finding which is
not surprising in view of the studies which have shown that mechanical
aptitude is in reality a combination of ability to judge spatial relations,
perception, and information. Similarly, we have seen that Bennett scores
and technical interests as measured by Strong's Blank are moderately
correlated, although the relationship was found to be negligible in a
more homogeneous group of men in whom interest was measured with
Kuder's inventory.

The effects of age and experience on Bennett scores have not been
adequately studied, although we have seen that partial data throw light
on some important aspects of these problems. There are no data on the
development of mechanical comprehension, but this is natural enough
in a composite trait. It has been brought out that the easier of the male
forms is too easy for brighter and more mature men, that presumed
cultural influences handicap women somewhat on the men’s form, but
that such specific and pertinent environmental influences as training in
physics do not appreciably affect men’s scores: apparently older boys’
and men’s opportunities to become familiar with the objects and
principles involved are sufficiently uniform in urban American culture to
make the test “universally” applicable. In this respect the test is
probably superior to O’Rourke’s.

Occupational significance of Bennett’s test has been made clear in a
variety of ways, even though the occupational groups included in the
published norms are in too many instances really pre-occupational or at
best marginal. As Bennett puts it in the manual, the test is likely to be
of value in jobs in which understanding machines is of fundamental
importance; when dealing with people or with abstract problems other
tests will have greater validity. Thus engineers and toolmaker learners
are characterized by a high degree of mechanical comprehension as
measured by this test; good machine operators tend to have more than
the general population; and foremen in some situations (presumably the
more technical) are found to be superior in mechanical comprehension
while those in others (presumably those in which human relations are
important) do not excel in this trait but are superior in other ways.

In schools and colleges the test can tentatively be used with the
published educational norms, but local norms should be developed as soon
as possible in view of the probable inadequacies of those in the manual.
The test should prove valuable in counseling students concerning the
choice of technical curricula and occupations: it may be safe to generalize
from the validity data and norms to say that those aiming at semiskilled
machine work might be expected to make scores above the t;,th
percentile of their high school class on Form AA, those considering skilled
trades above the 25th or 45th depending upon the trade, and those
aspiring to engineering and related professions above the 50th percentile for
their high school grade. These suggested critical scores, it should be
emphasized, have not been proved appropriate for these purposes: they
are merely those which the normative data on hand indicate might prove
valid. The test can also be used in the selection of students for technical
courses, for we have seen that the test has some validity for training in
such varied courses as machine shop, mechanics, physics, chemistry, and
military flying. In selection programs, of course, critical scores should
be established on the basis of local experience and validities.

In guidance centers and employment services the three forms can be
used as in schools and colleges, discussed above, and in business and
industry, considered in the next paragraph. The main problem in such
centers will be the choice of the appropriate form in the case of male
clients; it should be made on the basis of an appraisal of the education
and experience of the client, with regard to levels and quality of both
intellectual and mechanical content.

In business and industry the value of the Bennett Test of Mechanical
Comprehension should be greatest in the selection of trainees for skilled
technical jobs, and for semiskilled jobs in which fairly complex
equipment is used and the induction period is longer than usual. Local norms
and cut-off scores should be developed, as conditions and requirements
vary not only from job to job but also from plant to plant. The findings
reported in Shuman’s studies indicate the value the test can have when
so used. Even when experienced skilled workers are being selected the
test can probably be of some value if jobs being filled require versatility
of skills and ability to apply them to constantly changing situations. In
industrial work, as in counseling, due consideration should be given to
other measurable and less tangible factors, for we have seen that
intelligence, interest, and personality traits also play a part in success in skilled
work, sometimes, as in some foremen’s jobs, a more important part than
mechanical comprehension.

The MacQuarrie Test for Mechanical Ability (California Test Bureau,
1925)

The MacQuarrie Test for Mechanical Ability was developed in 1925
as a rough measure of promise for mechanical and manual occupations.
As pointed out in the introductory section of this chapter, it is not a
test of mechanical comprehension as such, but a battery of subtests each
of which was designed to measure some factor which it was believed
would be important to success in mechanical and manual occupations.
Subtests were designed to measure spatial visualization, manual
dexterity, and perceptual speed and accuracy, on the assumption that a test
made up of such items would measure mechanical aptitude. The test
might well be treated in the chapter on manual dexterities, insofar as
some of the subtests are concerned, or in the chapter on spatial
visualization, when dealing with other subtests; it is considered here because it,
like mechanical comprehension tests, is an attempt at an overall measure
of mechanical aptitude. It has been widely used and, despite defects
related to its early origin and insufficient subsequent editorial work,
has held its own as a very useful test of mechanical aptitude.

Applicability. The MacQuarrie Test was designed for use with
adolescent boys and girls, apparently as a tool for selection for trade
training. Subsequent work has found that the items are equally
applicable to adults, and adult norms and validity data have been
accumulated. The original norms (;,oj), only part of which are in the current
undated manual, show that scores increase each year from age 10 to age
19 or 20, the mean raw score at age 10 being 2(i, age 15, 57, and ages 19
and 20, by and 68 respectively. Mitrano (533), it is true, reported that
scores decreased with age in adolescence, a surprising finding until it is
noted that his sample of 13- to 16-year-olds were all in 8th grade and that
the oldest pupils were therefore probably the dullest members of the
class and the least well motivated. On the other hand, Goodman's (29S)
finding that scores decreased with age in a group of 329 women radio
assemblers aged 16 to 6 j years is not surprising: r's for subtests and age
ranged from —.21 (Location) to — .3 j (Tracing); r for the total score and
age was — e;N ^1M . — .09). As one might expect, younger adult subjects
tend to do better on a speed test. Use of appropriate norms, discussed
below, is important in view of the age differences which the original
adolescent norms make quite clear.

Content. The MacQuarrie is a booklet made up of seven subtests, the
first three of which (Tracing, Tapping, and Dotting) seem on inspection
to be measures of manual dexterity or eye-hand co-ordination, the next
three (Copying, Location, and Blocks) spatial visualization, and the last
one (Pursuit) perceptual speed and accuracy. Because of these differences
in content most users of this test in validation studies have preferred to
treat each part separately, a judgment which will be seen to be justified
by the results.

Administration and Scoring. This is a group test requiring about
one-half hour for administration. The only special precaution required
is making sure that examinees turn the page when so directed at the
end of each subtest, rather than working beyond the time limit. This is
easily controlled by beginning at once with the directions for the next
practice test, but when groups of more than 25 are tested assistance is
especially important. Scoring is more complex than for most
paper-and-pencil tests, as the scorer must, for example, examine each
opening in the lines of the Tracing Test to make sure that the pencil has
gone through the opening without touching the sides; a little practice
soon makes it possible to make these inspections very rapidly. It might
be noted in passing, however, that a somewhat greater degree of
mechanical aptitude on the part of the test author could have resulted in
machine or stencil-scoring for the Tracing, Dotting, Location, Blocks,
and Pursuit Tests, at least when the test was slightly revised at some
unspecified date after 1943.

Norms. The norms provided in the manual show the scores made by
an unknown number of adolescents of unspecified sex at ages 10 through
16, and for “average adults” of 17 and above. These are abbreviated
norms, showing only the means and critical percentiles rather than the
total distributions. In view of the continued increase in scores from
age 16 to 19 or 20 the lumping together of all persons over 16 might
be questioned, unless other data showed that the sample of older
adolescents was inadequate as a result of elimination in the last years
of high school. This has not apparently been actually demonstrated for
this test, but the fact that the mean average adult score reported in the
manual’s table of norms is only 62, as compared with those of 69 and 68
for 17- and 20-year-olds reported in the original norms, suggests that
the latter two groups may have been somewhat highly selected rather than
representative. More debatable, in view of the data, is the lumping
together of the two sexes in these general norms, for the norms for part
scores, to be discussed below, show sex differences for some subtests.
Finally, the failure to specify the number of cases involved in these
norms is to be deplored, although it may perhaps be deduced from the old
manual that the adolescents number 1000 minus the number of 17- to
20-year-olds, and from the new manual that the adults number 2000 or more.
In view of the paucity of descriptive data by which one might judge
the nature and adequacy of the adolescent sample it is necessary to use
these norms with extreme caution.

The adult norms supplied in the current manual remedy three defects
of the adolescent norms: they are specific as to sex, indicate the number
of cases involved (1000 of each sex), and, equally important, are arranged
according to subtests. The significance of the sex differences is not
computed, but the trend is for women to be superior in the spatial
subtests and in total scores; that the women are not superior in manual
dexterity is surprising, but the significance of failure to find such a
difference is not clear. One detail concerning age grouping raises a
question: although in the table of adolescent norms 16-year-olds were not
included in the average adult group and made lower scores than the
latter, in the table of adult norms they are included with the adults.
Presumably the age differences justify one treatment or the other, but
not both. Finally, as in the case of the adolescent norms, the sampling
is not described. One thousand men and an equal number of women might be
reasonably representative of adults in general; they might represent
some one segment of adults, such as routine clerks, quite adequately; or
they might be a hodge-podge which can be considered a sample of no
particular universe. In view of the very real difficulties which
complicate the establishing of adult norms, the user of psychological
tests is, in the absence of detailed descriptive data concerning
normative groups, justified only in assuming that the norms are based on
the last-named type of sample, i.e., a meaningless hodge-podge of adults.
Such norms can be used only with extreme caution.

More meaningful but specialized norms are provided by Bingham
(cj jg; 1 tip) based on data for 12.] apprentice toolmakers from 16 to
22 years old once employed by the Scovill Manufacturing Co. early in the
1930’s. Bingham points out that these norms, reproduced in Table 20,
correspond fairly closely to the 16-year-old norms of the original manual
at the mean but include relatively fewer high and low scores: they were,
in fact, a more homogeneous group such as one might expect to find
working on one job in a plant with a well-tried selection program.

Norms for a miscellaneous group of gg} i.p to 16-year-olds in a
sectarian guidance center and high school in Cleveland, Ohio, have been
published by Tuckman (880). As he points out, these agree rather well
with MacQuarrie's, and they supplement the latter by providing norms
for subtests and for each sex. In the absence of national or local norms,
these should prove useful.

Standardization and Initial Validation. There is relatively little
available on the standardization and initial validation of the
MacQuarrie test. As pointed out in connection with the norms, the manual
is quite inadequate in the provision of detailed information concerning
the test, the recent revision reading as though it had been written for
untrained and unsophisticated users of tests rather than for persons who
are familiar with psychometrics. The original article by MacQuarrie
(f>°l) gives little on the actual development of the test, although data
on the reliability and validity of the final form are provided. The total
score was found to have correlations with intelligence which equalled
.20 and .002 as measured by unidentified intelligence tests. Teachers of
shop courses rated the mechanical ability of their pupils, the correlation
between these and the MacQuarrie scores being as high as .]8. Other such
correlations were obtained but not reported, as the reliability of the
ratings was not satisfactory.
Pupils also did some undescribed mechanical work which was rated by
judges who did not know the pupils’ identity; the correlations with these
criteria were .32 and .81 for two different groups, but not enough detail
is supplied to make possible the judging of these quite different and in
one case almost unbelievable validities. This report must, of course, be
viewed in the light of the methods and standards of work current at the
time of publication; at that time recognition of the importance of
studying the criterion was much less widespread, and it was not generally
realized to what extent supporting detail is needed for the interpretation
of personnel studies. Despite its defects, it makes amply clear the fact
that this test is one of considerable promise, worthy of the further study
which it has fortunately subsequently received at the hands of others.

Reliability. MacQuarrie (50j) reported that the reliability of the
subtest scores was as follows: Tracing .80, Tapping .85, Dotting .74,
Copying .86, Location .72, Blocks .80, and Pursuit .76. The retest
reliability of the total score was more than .90. The number of cases
used in computing the total reliability was 34, 80, and 270 in three
different groups; the groups on which the part-score reliabilities were
based are not described. The manual makes no mention of reliability.

Validity. We have seen that the initial validation data published by
the author leave much to be desired insofar as detail is concerned,
but appear promising in a general way. Fortunately, a number of studies
have supplemented MacQuarrie’s findings.

Intercorrelations of the MacQuarrie subtests have been computed by
Goodman (299) in a factor analysis study, to be described in more detail
below. The coefficients range from .29 between Tapping on the one
hand, and Location, Blocks, and Pursuit on the other, to .55 between
Tracing and Dotting. The manual dexterity subtest intercorrelations
range from .jj to .77 and the spatial relations intercorrelations from
.72 to .,7 p while these two types of subtests are intercorrelated with
each other to the extent of from .29 to .} j. Correlations between
dexterity and perceptual tests are of the same order, but the spatial and
perceptual tests intercorrelate between .} \ and .48, which suggests that
the distinction may be arbitrary. The factor analysis throws more light
on this, by revealing indeed three factors, one called visual inspection
(our perceptual ability), another spatial visualization, and the third
manual movement (our manual dexterity). This last factor is important in
the Tracing, Tapping, and Dotting Tests; the spatial factor in the
Copying, Location, and Blocks Tests, to a lesser extent in the Pursuit
Test, and to a still lesser extent in the Tracing and Dotting Tests; and
the perceptual or visual inspection factor is important in the Tracing,
Dotting, and Pursuit Tests. Harrell (336) found the Dotting Test
saturated with a dexterity factor, the Copying, Blocks, and Pursuit Tests
saturated with a spatial factor. The subtests are not particularly pure
tests, although the three spatial tests are relatively unweighted by
other measured factors; at the same time, the classification into Spatial
(Copying, Location, Blocks), Manual Dexterity (Tapping), Manual-Visual
(Tracing and Dotting), and Visual-Manual (Pursuit) seems warranted for
interpretive purposes.

Intelligence tests have been found to have correlations with the
MacQuarrie which vary from .02 to .62. Horning (380) tested 25 pupils
aged 12 to 15, finding a correlation of only .02 with intelligence as
measured by the Terman Group Test. Murphy (556) worked with 143 9th grade
boys, finding no relationship between MacQuarrie and Terman Group
Test scores. Holcomb and Laslett (375) used the A.C.E. Psychological
Examination with 50 engineering freshmen and found an r of .303. Morgan
(540) administered the MacQuarrie and Army Alpha to boys aged 13
through 16, each age-group including from 35 to 139 members, and
obtained correlations of .33, .35, .39, and .16 respectively; it should
perhaps be noted that the low coefficient is that based on the smallest
group. Pond, as reported by Bingham (94:317), found a correlation of .38
between MacQuarrie and Otis, her subjects being 83 apprentice toolmakers.
Finally, both Sartain (669) and Babcock and Emerson (33) obtained
correlations of .62 between MacQuarrie and intelligence tests, the former
using the Otis with 46 aircraft factory inspectors and the latter a
vocabulary test with 300 subjects ranging in age from 14 to 28. The
last-named study found that, contrary to expectation, the correlation
between intelligence and MacQuarrie scores increased with age.

At first glance, it seems almost hopeless to attempt to rationalize such 
divergent findings. But if these studies are grouped according to the 
homogeneity of the subjects the differences in the findings seem more 
reconcilable. The two studies reporting no relationship, it should be
noted, are probably those in which the subjects were most homogeneous; 
pupils in a shop course and 9th grade boys. Those reporting moderately 
high correlations also tend to be those which were fairly homogeneous: 
engineering freshmen, high school boys by age groups, and apprentices 
in one company. One of the investigators who reported high correlations 
worked with an extremely heterogeneous group of cases: Babcock and
Emerson's subjects not only ranged in age from 14 to 28, but, more
important still, were reached as clients of a counseling service and students
in public schools. In the other study, Sartain’s, the heterogeneity of the 
adult workers studied is shown by a mean Otis score of 28.61 and a stand¬ 
ard deviation of 9.48, equivalent (20-minute time limit) to a mean Otis 
I. Q. of 95, minus one sigma being I. Q. 82 and plus one sigma being 108; 
this suggests that, although the adult group was small and occupationally
homogeneous, it was heterogeneous in aptitudes. Since it has frequently 
been demonstrated that the greater the heterogeneity of the group the 
greater the correlation between their scores on any two psychological 
tests, it may probably be concluded that the MacQuarrie Test of
Mechanical Ability is relatively independent of intelligence in persons of
similar status, but somewhat associated with it in groups of varied indi¬ 
viduals. 
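
The dependence of a correlation on the heterogeneity of the group can be sketched numerically. The following is only an illustration under stated assumptions, not a reanalysis of the studies cited: it assumes bivariate normal scores and direct selection on one of the two variables, and applies the standard range-restriction formula. The function name and the SD ratios are hypothetical; .62 is used simply because it is the highest correlation reported above.

```python
import math


def restricted_r(r_unrestricted: float, sd_ratio: float) -> float:
    """Correlation expected in a narrower group.

    sd_ratio is the SD of the narrower group divided by the SD of the
    heterogeneous group on the variable that defines the selection
    (assumption: direct restriction, bivariate normal scores).
    """
    r, u = r_unrestricted, sd_ratio
    return r * u / math.sqrt(1.0 - r * r + r * r * u * u)


# A correlation of .62 in a varied group shrinks as the group becomes
# more homogeneous (hypothetical SD ratios).
for u in (1.0, 0.75, 0.5, 0.25):
    print(f"SD ratio {u:.2f}: expected r = {restricted_r(0.62, u):.2f}")
```

Under these assumptions a group with half the spread of the heterogeneous sample would be expected to show a correlation in the high .30's, which is the order of magnitude found in the more homogeneous groups described above.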

Mechanical comprehension tests which have been correlated with the
MacQuarrie include the O’Rourke and the Bennett. Scudder and
Raubenheimer (686) found no relationship (.01) between O’Rourke and
MacQuarrie scores, using data from 114 7th and 8th grade boys. Sartain’s
study (669) showed a correlation of .20 between the two tests, his subjects
being 46 inspectors. McDaniel and Reynolds (494) reported a correlation
of .51 based on 147 students in high school and defense-training courses.
The differences in results again appear to be due to degrees of
heterogeneity in the groups, the first being probably the most homogeneous
and the last undoubtedly the most heterogeneous. Similar data are
available for the Stenquist Mechanical Assembly Test, Scudder and
Raubenheimer (686) reporting a correlation of .01 and Harrell (44 [) one
of An. For Bennett’s test the results are more consistent. Bennett (68)
reports correlations of .jo and .48 based on 140 WPA and 220 apprentice
training applicants. McDaniel and Reynolds also found a correlation of
. jS with 147 high school and defense-training students, while Sartain’s
(669) factory inspectors yielded a correlation coefficient of .4 ^ for the
same two tests. Underlying these more consistent findings is the fact,
discussed elsewhere, that the MacQuarrie was a part of the criterion used
to determine the selection of items for the Bennett test.

Spatial visualization tests correlated with the MacQuarrie include the 
Revised Minnesota Paper Form Board. Morgan (540) and Sartain (669) 
agreed in reporting correlations of about .40 to .40, although the use of 
total scores somewhat obscures the relationship shown in Harrell’s (336)
factor analysis, previously discussed. The correlation between
MacQuarrie Copying and the Minnesota Paper Form Board, for example,
is .49 (556), showing the greater importance of spatial visualization in
the Copying than in some of the other subtests. 

Interest in technical subjects as measured by Strong’s Engineering key
was related to MacQuarrie scores in one study, in which Holcomb and
Laslett (375) tested engineering freshmen. The relationship was lower
than in the case of experience-affected mechanical comprehension tests,
being only .22.

Grades have been used as a criterion of the validity of the MacQuarrie
in junior high schools, technical schools, engineering colleges, dental and
nursing schools, and commercial schools and colleges. Horning (380) had
25 boys aged 12 to 15 graded on the basis of a project completed in a shop
course, and on the basis of time taken to complete the project. The
correlations with test scores were respectively .79 and .72, both
remarkably high. Scudder and Raubenheimer (686) made a study using the
grades of 114 7th and 8th grade boys as a criterion which did not,
however, agree with this: their validity coefficient was only .08.
Unfortunately, both studies are so sketchily reported as to make
evaluation difficult.

Class standing achieved in technical and industrial schools by boys 13
to 16 was the criterion employed by Morgan (540), with from 35 to 139
boys in each age group. His multiple R was .60; that for the MacQuarrie
alone was not given. The 147 high school and defense-training students
studied by McDaniel and Reynolds were rated for mechanical aptitude by
their instructors and subtest validity coefficients were calculated.
These were as shown in Table 21.

Table 21

CORRELATION BETWEEN MACQUARRIE SUBTESTS
AND INSTRUCTORS’ RATINGS

MacQuarrie      Rating

Tracing          .22
Tapping         -.17
Dotting          .22
Copying          .21
Location         .JO
Blocks           .22
Pursuit          .12
Total            -M

These are certainly not impressive; that this may be due to defects in 
the criterion rather than in the test is a truism which the authors seem 
to have forgotten, for there is no discussion in the paper of the reliability 
of their criterion, and such ratings are notoriously unreliable. That this
may be the explanation in this case is suggested by the equally low
validities of the other tests used, although the multiple correlation co¬ 
efficient based upon the MacQuarrie and O’Rourke subtests and the 
Bennett total score was .45. 
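
How much an unreliable criterion can depress a validity coefficient may be sketched with the classical attenuation formula. This is an illustration only: the paper reports no reliabilities, so the values below are hypothetical, and the function names are introduced here for the example.

```python
import math


def attenuated_r(true_r: float, test_reliability: float,
                 criterion_reliability: float) -> float:
    """Observed correlation expected when both measures contain error."""
    return true_r * math.sqrt(test_reliability * criterion_reliability)


def corrected_r(observed_r: float, test_reliability: float,
                criterion_reliability: float) -> float:
    """Classical correction for attenuation (inverse of the above)."""
    return observed_r / math.sqrt(test_reliability * criterion_reliability)


# Hypothetical case: an underlying relationship of .50, a test with
# reliability .80, and instructors' ratings with reliability .40.
observed = attenuated_r(0.50, 0.80, 0.40)
print(f"expected observed validity: {observed:.2f}")
print(f"corrected back: {corrected_r(observed, 0.80, 0.40):.2f}")
```

With figures of this order an observed validity in the .20's is what one would expect even from a test with a substantial underlying relationship to the trait being rated.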

Grades in courses taken by aviation mechanic trainees in the Army Air 
Forces prior to World War II were correlated with MacQuarrie and other 
test scores by Harrell and Faubion (339). The correlation between this 
one test and grades in drafting and blueprint reading was .47. 

Engineering grades during the freshman year and over the whole four
years of college were correlated with the MacQuarrie by Brush (122) in
a study of more than 100 men at the University of Maine; the correlations
were respectively .27 and .22, with probable errors of .06. The best
subtest correlations were, as might be expected, those which measure
spatial visualization, but these also were low, ranging from .24 to .265
for freshmen grades and from .18 to .27 for four-year marks. Revised
Minnesota Paper Form Board scores, on the other hand, had validities of
.42 and .43. Brush cites an unpublished study by Horton in which the
MacQuarrie yielded a correlation of . j j with engineering drawing grades,
subtest scores ranging from .13 to . jo. Equally good results were
obtained by Holcomb and Laslett (375), who found a correlation of .48
between MacQuarrie scores and grades of 50 freshman engineers. The
discrepancies are difficult to explain; however, Brush's numbers were
greater and his criterion went beyond first-year grades.
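
The probable errors quoted for Brush's coefficients can be related to his sample size with the conventional formula for the probable error of a correlation, PE = .6745(1 - r²)/√N. The sketch below is illustrative; the function names are introduced here, and the only figures taken from the text are the correlations of .27 and .22 and the probable error of .06.

```python
import math


def probable_error(r: float, n: int) -> float:
    """Conventional probable error of a product-moment correlation."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


def implied_n(r: float, pe: float) -> float:
    """Sample size implied by a reported r and probable error."""
    return (0.6745 * (1.0 - r * r) / pe) ** 2


for r in (0.27, 0.22):
    print(f"r = {r:.2f}, PE = .06 implies N of about {implied_n(r, 0.06):.0f}")
```

Both coefficients imply a sample of somewhat over one hundred cases, which agrees with the description of the study as based on more than 100 men.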

Grades in dental schools were correlated with MacQuarrie scores in
studies by Thompson (82]) and by Robinson and Bellows ((134). In the
latter the correlations were .33 and .48 for two different groups of
freshmen, .40 for sophomores. In the former, the correlation with freshman
theory grades was .03 (N = 138) and with practicum grades it was .ir. For
seniors (N = 66) the coefficients were .17 and .13. Correlations between
part scores and criterion were no better for theory courses, but that
between manual dexterity subtest scores of seniors and practicum grades
was .72 and that between spatial subtest scores and senior practicum
grades was -.27. It is noteworthy that the same trend held for freshman
practicum grades (.22 and -.23), and that the correlations were reliable
even though slightly lower. Just why the spatial parts of the test should
be negatively correlated with laboratory grades is difficult to understand,
although Thompson considers it logical, and the failure to confirm
Robinson and Bellows’ results for grades in general is also a topic for
further investigation. It is perhaps relevant that Sartain (669) obtained
results rather like Thompson’s for manual dexterity and spatial parts of
the MacQuarrie and average grades during the first six months of nursing
training, with the difference that both coefficients were positive (.26 and
.36), as logical analysis of the test and of the tasks involved would lead
one to expect, for spatial judgments are important in both theoretical
and practical aspects of the sciences.

The predictive value of the MacQuarrie for clerical training in which
manual dexterity might be considered important has been ascertained
in several studies. Using 12 j entering commercial high school girls as his
subjects, 37 of whom graduated three years later, Kingman (p$l) found
the latter superior on MacQuarrie Blocks and Tapping subtests, with
somewhat better scores on other subtests the differences for which were
not clearly significant. Gottsdanker (303) tested 51 women students in a
business college, and used examinations in work with machine
calculators as his criterion of success. The three dexterity tests had the
following validities: Tapping .25, Dotting .21, and Tracing .08. The
validities are such as might be expected from the nature of the tests.
Barrett (pi) also worked with college age women, but hers were liberal
arts students, <)(> of whom were studying typing and 75 shorthand. Final
grades were the criterion. No correlation coefficients were computed, but
instead the effectiveness of the tests in differentiating superior from
inferior students was ascertained. For typing the best subtests and their
critical scores were the Tracing 50; Dotting 22; and Pursuit Tests 22; for
shorthand, the Pursuit Test 2 j. It seems odd that the Tapping Test was
not also valid for typing, but it did not differentiate between good and
poor typing students; the Tapping, Dotting, Copying, and Blocks Tests
also had some discriminating value for shorthand, but not sufficient to
justify using them in addition to the other tests which had proved more
useful. On logical grounds, the Pursuit Test should have the most
validity, for it seems to involve to a high degree the smooth-flowing and
precise co-ordination of hand and eye which is required in writing
shorthand.

Success on the job, it is interesting to note, was not used as a criterion
of the validity of the MacQuarrie Test for Mechanical Aptitude until
more than ten years after its publication. Harrell administered it to loom
fixers (33.]); then the United States Employment Service used it in its
studies of occupational ability patterns (770); subsequent studies have
been published by Blum (104), Sartain (669), and Goodman (298, 300).

In his study of loom fixers Harrell used 15 subjects employed in one
Southern plant, with ratings by supervisors as the criterion of success.
Each employee was rated by three or four persons on a six-point scale
for mechanical ability. The reliability of the ratings was not ascertained,
and no validity coefficient was published for this test.

Sartain, as has been seen in another context, worked with 46
aircraft-factory inspectors in a refresher training course, using ratings
as a criterion. The correlation between MacQuarrie scores and ratings was
.65, and was this high partly, no doubt, because of the greater importance
of abstract abilities such as spatial visualization in training courses
than in actual work. Ghiselli (i>SG) studied another group of 26
inspectors, but these were girls who inspected and packed pharmaceutical
products: the study has already been described (p. 179). In this case the
criterion was ratings of performance on the job, and the correlation with
the MacQuarrie test was only .19, the lowest r obtained.

Sewing-machine operators were tested by Blum (104), who selected
the 25 highest-earning and 25 lowest-earning workers on piece work,
using a combination of ratings and earnings as a criterion. The Tracing
Test was the best single subtest (not confirmed by Stead and Shartle, as
discussed below), better than any other and better than the total score.
A critical score of 40 was established for this subtest, and would have
eliminated 76 percent of the poor and 40 percent of the good operators
when applied to this same group of workers. Failure to cross-validate was
a defect in this study, as there would certainly be some shrinkage in
discriminating power. Although the percentage cited would, if it remained
the same in future samples, improve selection appreciably, the critical
score eliminated so many successful workers that it could be applied only
in an employer's market. (It would have eliminated about 57 and 70
percent of two USES samples, discussed below.) Other tests should be added
in such a program, in order to cut down the percentages of false-positives
and false-negatives.
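
The cost of such a critical score can be made concrete with a little arithmetic. The sketch below is illustrative only: it takes two groups of 25 workers and rejection rates of 76 and 40 percent as its inputs (the first figure is a reading of the percentage given above and should be treated as an assumption), and the function name is introduced here for the example.

```python
def selection_outcomes(n_good: int, n_poor: int,
                       good_rejected: float, poor_rejected: float) -> dict:
    """Summarize what a critical score does to a group of known workers.

    good_rejected and poor_rejected are the proportions of good and of
    poor workers falling below the critical score.
    """
    accepted_good = n_good * (1.0 - good_rejected)
    accepted_poor = n_poor * (1.0 - poor_rejected)
    accepted = accepted_good + accepted_poor
    total = n_good + n_poor
    return {
        "rejected_overall": 1.0 - accepted / total,
        "good_among_accepted": accepted_good / accepted,
        "good_before_testing": n_good / total,
    }


result = selection_outcomes(25, 25, good_rejected=0.40, poor_rejected=0.76)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

On these figures the proportion of good operators among those accepted rises from one-half to about seven in ten, but well over half of all applicants are turned away, which is why such a standard is workable only when applicants are plentiful.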

In a recent thorough study Goodman (298) administered the
MacQuarrie to 329 women radio assembly operators immediately after they
were hired. Their age range was 16 to 6 j, with a mean of 27, 3 j percent
being under 19 years of age and only 15 percent over 50 years old. The
job was described as follows in the job summary: “Assembles radio
components such as tube sockets, transformers and capacitators on chassis
to form a complete set; assembles terminal boards and other small
assemblies using hand tools; mounts subassemblies on chassis and secures
them in place using nuts and bolts or soldering iron and rosin-core solder;
removes insulation from wires using sandpaper and emery cloth, and
tins stripped leads; may specialize in one phase of assembly details.”

The criterion was a rating of each new employee by the
vestibule-training-school instructor after the construction of three
models; ratings were based on the amount of work done during a fixed
period of time, and on qualitative factors such as excess or deficiency of
solder and looseness of joints. No check was made on the reliability of
the ratings, perhaps because of operating problems, but the distribution
of ratings was found to be normal after proper statistical treatment.
Validity coefficients for the part scores of the MacQuarrie are shown in
Table 22.

Table 22

CORRELATIONS BETWEEN THE MACQUARRIE
TEST AND RATINGS OF ASSEMBLY WORK

(n = 329)

MacQuarrie Subtest     r Ratings

Tracing                 .32
Tapping                 .18
Dotting                 •C 3
Copying                 • 3 *
Location                •3 r>
Blocks                  .32
Pursuit                 .27
Total                   .42


It will be noted that the validity of the total score is greater for this
job than is that of any subtest, although this is not true of certain other
jobs or training courses. The reason for this is made clear by the fact
that five of the subtests have moderate validities: apparently the work
is of a type which requires manual, spatial, and perceptual aptitudes
rather than just one of these abilities. It is because of its tapping of these
three widely applicable aptitudes that the MacQuarrie has so often
proved to have some validity, although other and better measures of any
one of these aptitudes usually prove more valid when relevant. It is worth
noting that when the most effective combination of the subtests was
made, the multiple R (all subtests) was .46, only four points higher than
the zero-order correlation of the total score.

Unlike most publishers of such studies, Goodman went further in
order to ascertain the efficiency of this test in employee selection. His R
of .46, evaluated by means of the coefficient of alienation, shows that use
of the MacQuarrie would improve the selection of radio assembly
operators in that plant by about 11 percent over and above what it would
be without the test. The company then planned to apply the
Taylor-Russell selection-ratio tables (812), selecting for employment only
the top 30 percent of the distribution on the MacQuarrie. It was estimated
that the old method resulted in the selection of employees, 50 percent of
whom were satisfactory. With an R of .46, the selection ratio set at 30
percent, the Taylor-Russell Tables indicated that 71 percent of those
selected with the aid of the MacQuarrie should be satisfactory. At this
point the wartime shortage of personnel became so acute that every
applicant seeking work had to be hired; it was still possible to make such
a study in retrospect, however, using procedures which it had been
planned to apply to future employees. The results were reported in a
third article (300). Of the original 329 employees, 193 or 38 percent had
left the company; 33 of these were discharged, largely for “inability to
do the work.” An attempt to establish critical scores was considered a
failure, but those who left of their own accord made significantly better
scores than those who were discharged (M —38, 30), and those who
remained made intermediate scores which tended to be better than those
of the dischargees (M ~ j j. OR. se r.ji). If the Taylor-Russell ratio had
been used, significantly fewer dischargees would have been selected, but
almost proportionately fewer long-tenure workers also would have been
accepted. The test did not, therefore, contribute materially to selection.
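
The two evaluations described above can be approximated numerically. The following is a sketch under stated assumptions rather than a reproduction of Goodman's procedure: forecasting efficiency is taken as one minus the coefficient of alienation, and the Taylor-Russell figure is approximated by integrating a bivariate normal model instead of reading the published tables. The function names and the integration settings are introduced here for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def forecasting_efficiency(r: float) -> float:
    """One minus the coefficient of alienation, sqrt(1 - r^2)."""
    return 1.0 - math.sqrt(1.0 - r * r)


def proportion_satisfactory(r: float, base_rate: float,
                            selection_ratio: float, steps: int = 4000) -> float:
    """Expected share of satisfactory workers among those selected.

    Assumes test and criterion are bivariate normal with correlation r,
    that base_rate of all applicants would be satisfactory, and that the
    top selection_ratio of test scores is hired.
    """
    x_cut = STD_NORMAL.inv_cdf(1.0 - selection_ratio)
    y_cut = STD_NORMAL.inv_cdf(1.0 - base_rate)
    resid_sd = math.sqrt(1.0 - r * r)
    upper = 8.0  # far enough into the tail for the density to vanish
    h = (upper - x_cut) / steps
    total = 0.0
    for i in range(steps + 1):
        x = x_cut + i * h
        # Chance of exceeding the criterion cut-off at test score x.
        p_ok = 1.0 - STD_NORMAL.cdf((y_cut - r * x) / resid_sd)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid rule
        total += weight * STD_NORMAL.pdf(x) * p_ok
    return total * h / selection_ratio


print(f"forecasting efficiency: {forecasting_efficiency(0.46):.3f}")
print(f"satisfactory among selected: "
      f"{proportion_satisfactory(0.46, 0.50, 0.30):.2f}")
```

The first figure comes out at about 11 percent, and the second should fall close to the proportion read from the Taylor-Russell tables, small differences being attributable to rounding and interpolation in the printed tables.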

The Division of Occupational Analysis of the United States
Employment Service used the MacQuarrie Test in its test development work,
including it in the research batteries for a variety of occupations
according to hypotheses suggested by analysis of the test and of the job
(Dvorak in 730, Ch. 6). The result was the finding that some of the
subtests are valid for clerical occupations as well as for some mechanical
jobs, just as one might expect in the case of tests of manual dexterity
and of perceptual ability. A group of 227 clerical workers were compared
with 78 manual workers (not otherwise described), and were found to equal
or exceed Q3 of the latter group on the Tapping, Dotting, Copying,
Location, and Blocks subtests. The last three may have been due to
differences in mental ability, since spatial visualization is an abstract
function, but the first two have been seen to be primarily dexterity
tests. Validity coefficients for the occupations concerned are presented
in Table 23; data on occupational differences are discussed subsequently.

Outstanding in this table are three facts: the validity of some of the
subtests for occupations in both clerical and manual fields, the
unreliability of even some high correlation coefficients when checked on
another sample of workers in the same job, and the different validities of
tests saturated with identical factors. Illustrative of the former point is

Table 23

VALIDITY COEFFICIENTS AND CRITERIA FOR THE MACQUARRIE SUBTESTS

(After Stead and Shartle)

Occupation                       N      Criterion     r, Subtests I to VII

Clerical Occupations
Card-Punch-Machine Op.           121    Output        .16  .12  .27  .05  .03  - °.3  .17
Card-Punch-Machine Op.           113    Output        .05  .25  .19  .24  .07  -.04  .10
Index Clerk                      5> 2   Error ratio   -.09  -.29  -.03  -.08  -.14  .07  - .2 r )
Toll-Bill Clerk                  19     Output        —. ro  -.27  -.24  -.04  .02  .07  -.28
Calculator Operator              80     Worksample    . r;  .07  .38  .33  .38  .43
Adding-Machine Operator          26     Worksample

Manual Occupations
Pull-Socket Assembler            16     Output        -.01  .22  .30  .02  -.14  .14
Put-in-Coil Girl                 18     % effic.*     -.29  -.28  -.24  -.40  -.06  -.09
Power-Sewing-Machine Operator    46     % effic.      .27
Power-Sewing-Machine Operator    23     Unknown       .17  .12  • Of,  .20  ■ 1 r >  -.tf
Lamp-Shade Sewer                 19     Output        . r6  — .2")  -.19  -.08  .77  .77
Merchandise Packer               30     % effic.      -.18  .28  7/  -.01  .21  •1 r >
Can Packer                       43     Output        .18  .24  7.3  - 2 °  .09  -.11

* Ratio of time set by time study to complete work to actual time required
by worker in question to complete work.


the Tracing Test, moderately valid for calculating and adding-machine
operators and also for pull-socket assemblers, and the Location Test,
which has positive validity for the two business-machine operator groups
and for lamp-shade sewers but negative validity for put-in-coil girls.
Illustrative of the fluctuation of validity coefficients when the samples
are small are the correlations of .51 and .10 for two groups of
power-sewing-machine operators, a difference which might, however, be due
to differences in the criteria, one of which is not specified. The third
fact is illustrated by the validity of the first dexterity test (Tracing)
for three occupations and the doubtful validity of the second test of
manual dexterity for any of the fields in question, and also by the
validity of the first spatial test (Copying) but not of the second
(Location) for pull-socket assemblers.

Despite these discrepancies, inspection of the table suggests that there
is a tendency for the Tapping and Dotting Tests, and for the Copying
and Location Tests, to agree. The dexterity tests tend to have some
validity for various types of office-machine operators and for packers, both
of which agree reasonably well with logical analysis of the tasks; the
latter, or spatial tests, tend to have some validity for office-machine
operators and for machine and hand-sewers. It is unfortunate, since the test
was designed as a test of mechanical aptitude, that no mechanical
occupations were included; the differential validities for other types of
occupations are helpful as indicators of possibly worthwhile groups on
which to try the test for selection purposes; they are not, however,
clear-cut enough to provide very helpful data in counseling. This fact will
be brought out especially by the data on occupational differences, for some
of the high-scoring occupations are those for which low validities were
reported, and some of those for which the subtests have moderately high
validities are fields, the mean scores of which are relatively low, an
apparent paradox which will be discussed in subsequent paragraphs.

Occupational differences have apparently not been studied as such by
means of the MacQuarrie Test for Mechanical Ability, but data on
differences between a few jobs have been reported in Stead and Shartle
(750: [pages illegible]) in the form of graphs which show the approximate
means and standard deviations. The groups of workers making high scores on
the manual dexterity subtests include index clerks, put-in-coil girls,
card-punch-machine operators, and toll-bill clerks; power-sewing-machine
operators, can packers, and adding and calculating-machine operators
tend to make low scores on one or more dexterity tests. On the spatial
tests those tending to make high scores were card-punch-machine operators,
index clerks, and toll-bill clerks, although the can packers included
many high scorers on the one three-dimensional subtest (Blocks), as did
also the merchandise packers. Low scores on the spatial tests were most
frequently made by power-sewing-machine operators. The Pursuit Test,
which is both perceptual and spatial, is one on which card-punch-machine
operators and electrical-assembly workers tend to make high scores,
the power-sewing-machine and adding and calculating-machine operators
being low.

It is interesting to note that the data on occupational differences do
not always agree with those on the correlation between scores on these
tests and output. For example, the correlation between Location Test
scores and card-punch-machine operation has been seen to be .03 and .07
for two samples, while in contrast with this negligible relationship we
have also seen that card-punch-machine operators make higher scores on
the Location Test than most of the other groups of workers tested. At
first this seems inconsistent, but on second thought it is not illogical for
a job to require a fairly high degree of a given aptitude, natural selection
discouraging or eliminating those who lack it, and yet not to be so
dependent on it that those who possess it in a high degree excel in the
work. We have already seen this in connection with intelligence tests,
the data showing that in many occupations the workers must have more
than a critical minimum of mental ability and that additional increments
do not affect success, other factors then becoming much more important.
So, apparently, it is in the case of other aptitude tests. This means that
evaluations of the effectiveness of tests in personnel selection and
guidance should not be based on correlation coefficients alone.

It is also true that in some low-scoring groups the correlation with
success is moderately high. Can packers, for example, made a relatively
low mean score on the Dotting Test, but the correlation with output in
their case was found to be moderately high ([illegible]). To make this
point in another way, a high correlation between test scores and success
does not necessarily mean a high critical minimum for employment; and a high
critical minimum for employment does not necessarily mean that a high
correlation will be found between the scores of unselected workers and
success, although it would mean a substantial correlation between test
scores and success in an unselected group of applicants for work.
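
The distinction between a critical minimum and a correlation can be
illustrated with a small simulation. The sketch below is not drawn from
the studies cited: the score scale, the cut-off of 55, and the size of the
"other factors" term are hypothetical values chosen only for illustration.
It assumes that aptitude contributes to success only up to a critical
minimum, and that persons below that minimum do not remain in the job.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate(n_applicants=5000, critical_minimum=55.0, seed=0):
    rng = random.Random(seed)
    # Hypothetical applicant pool: test scores with mean 50 and SD 10.
    scores = [rng.gauss(50, 10) for _ in range(n_applicants)]
    # Assumption: aptitude helps only up to the critical minimum;
    # beyond it, other factors (SD 5) decide success.
    success = [min(s, critical_minimum) + rng.gauss(0, 5) for s in scores]
    # Workers are the applicants who meet the critical minimum.
    workers = [(s, y) for s, y in zip(scores, success) if s >= critical_minimum]
    w_scores = [s for s, _ in workers]
    w_success = [y for _, y in workers]
    return {
        "r_applicants": pearson_r(scores, success),
        "r_workers": pearson_r(w_scores, w_success),
        "mean_applicants": sum(scores) / len(scores),
        "mean_workers": sum(w_scores) / len(w_scores),
    }


result = simulate()
print(result)
```

Under these assumptions the workers have a high mean score and a
correlation with success near zero, while the unselected applicants show
a substantial correlation, which is the pattern described for the
card-punch-machine operators.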

Job satisfaction, in the case of the MacQuarrie as in that of most other
tests, has not been used as a criterion of success.

Use of the MacQuarrie Test of Mechanical Ability in Counseling and
Selection. The evidence reviewed in the preceding pages makes it clear
that the MacQuarrie Test of Mechanical Ability measures three different
aptitudes: manual dexterity, spatial visualization, and perceptual speed
and accuracy. Although some of the subtests appear to be relatively pure
measures of one single factor (the Copying, Location, and Block Tests
measure spatial visualization and Tapping measures dexterity), others
are measures of combinations of factors (Tracing and Dotting are
manual-perceptual, and Pursuit is perceptual-manual). This being the case,
it was not surprising that the educational and occupational significance
of the test was sometimes obscured by the use of total scores, and the
significance of the subtests was found to vary with the occupation.

The effects of maturation on the MacQuarrie Test appear to be an
increase in scores during adolescence, followed by the decrease with later
adulthood which is usually found in scores on tests in which speed is a
factor. Although these tendencies have been made sufficiently clear to be
considered in counseling, they have not been studied in great enough
detail to make possible the establishment of special norms for use in
counseling either early adolescents or older adults in terms of status
in comparison with adult workers. In such cases it is possible only to use
age norms (adolescents) or general adult norms (making allowances for
age on a rule-of-thumb basis).

Occupations for which the test has validity include business-machine
operators (calculating machines, adding machines, card-punch machines,
etc.), small-assembly workers (radios, electrical pull-sockets, etc.), and
packers (merchandise and cans), although some subtests are valid for
some and not for others of these. Superior aircraft factory inspectors tend
to make higher total scores than less successful inspectors, and efficient
radio-assembly operators surpass less efficient operators on the total score,
because both jobs seem to require the combination of aptitudes
represented by high total scores. On the other hand, good can packers excel
on some of the manual dexterity and on the three-dimensional tests, but
not on others, and good power-sewing-machine operators make higher
scores on the Blocks Test than do inferior operators while the Copying
Test has little validity for this group.

School and college use of the MacQuarrie can be varied. The test is
useful in counseling students concerning the choice of trade, technical,
and dental curricula, although its validity is not as great as some studies
suggest and part-scores should be used in fields such as dentistry, with
recognition of the fact that other factors are of considerably more
importance than those assessed by the MacQuarrie. The dexterity and
pursuit subtest scores also have bearing on success in training in typing
and shorthand. Because of the specificity of its part-scores, the
MacQuarrie is likely to be more valuable in selection for training than in
counseling concerning fields of endeavor.

In guidance centers and employment services this test can be useful in
counseling clients concerning training in the fields just listed, and in
screening employment applicants who are most likely to prove successful
in office-machine operation and assembly jobs.

In business and industry the MacQuarrie can be a useful screen for the
selection of the business-machine operators and assembly workers who
have the manual dexterities and spatial aptitude which make for success
in such work. Because of the specific factors measured by the test and the
great variations in the psychological requirements of machine-operation
and assembly jobs it is important that local validities and cut-off scores
be established for each subtest, rather than depending on data from other
studies. As Stead and Shartle's data have shown, subtest validities
sometimes vary even from one sample to another, as when the Pursuit Test
yielded a validity of .51 for one sample of power-sewing-machine
operators and .01 for another.

The Purdue Mechanical Adaptability Test (Div. of Applied Psychology, 
Purdue University, 1946) 

The Purdue Mechanical Adaptability Test was published in 1946 as a
result of work designed to produce a brief test which could be used by
industrial personnel workers to measure “knack” for mechanical,
electrical, and related activities. It was assumed that the most effective
way to do this was to measure the amount of information acquired concerning
mechanical, electrical, carpentry, plumbing, and related tools, materials,
and processes. The test is, therefore, very similar in approach and content
to the O’Rourke Mechanical Aptitude Test, previously discussed in some
detail. It differs in that it uses only verbal rather than both graphic and
verbal items, and in that it is much briefer, consisting of only 60 items.
Although the only published study of the test in print at the time of
writing is the original one by the test's authors (454), the instrument is
briefly treated because it seems likely to become a widely used and
valuable instrument.

Description. The 60 items in the test are divided as follows: woodwork
and finishing, 10 items; automotive, 17; electricity and radio, 18;
machine shop, 4; plumbing, 4; sheetmetal, 2; miscellaneous, 5. These
items were selected from 400 which were written to tap first-hand contact
rather than principles, and to utilize 8th grade vocabulary except for
technical terms. The 100 best items were selected on the basis of lack of
relationship to an intelligence test and internal consistency and tried
out on 439 high school and college students, revised on the basis of their
answers and criticisms, and administered to [illegible] men applying for
steel mill jobs and to 98 men employed in foundries and metal products
manufacturing concerns. Again lack of relationship to intelligence test
items and internal consistency were the criteria for evaluating items, the
60 item Form A for Men being the result. The weighting of the different
fields of “mechanical” work was therefore based not on judgment of the
appropriate representation of the types of activity in which boys and men
engage, but on the proved usefulness of various types of items in
consistently measuring familiarity with tools, materials, and procedures in
a variety of fields in which men and boys are customarily active. The result
is an empirical rather than an a priori weighting which takes into account
the very factors which a priori judgment might have considered.

The test takes 15 minutes to administer, and scoring is simply a matter
of counting the correct responses, doubling this sum, and adding the sum
of the “don't knows.” Norms given in the manual are for 667 industrial
applicants, not described. The article by Lawshe, Semanek, and Tiffin
(454) provides norms for 1015 “industrial” men, 103 non-engineering
college men, 54 engineers in non-mechanical fields, and 71 mechanical
engineers. The latter groups are sufficiently well described for the norms
to be of some value, despite the small numbers: almost all were
underclassmen at Purdue, the non-engineers being majors in science, pharmacy,
and physical education. The industrial group is not described, although
it presumably includes groups mentioned in the paper, namely industrial
applicants in a steel mill and industrial applicants in an optical
manufacturing plant. These groups are not, however, well enough defined as
to intellectual or trade level for one to be able to use them as general
norms: they may, for example, have been applicants for skilled jobs, or
applicants for unskilled employment, or, more likely, an unknown
mixture of applicants for unskilled, semiskilled, and skilled jobs.
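
The scoring rule stated at the head of this section is simple enough to
write out. The following is a minimal sketch which assumes the rule is
applied exactly as stated (two points for each correct response, one for
each “don't know,” nothing for a wrong answer); the function name and the
example counts are illustrative and do not come from the manual.

```python
def purdue_score(correct, dont_know, n_items=60):
    """Score = twice the number of correct responses plus the "don't knows"."""
    if correct < 0 or dont_know < 0:
        raise ValueError("counts cannot be negative")
    if correct + dont_know > n_items:
        raise ValueError("more responses than items on the form")
    return 2 * correct + dont_know


# Hypothetical examinee: 45 right, 10 "don't know", 5 wrong.
example_score = purdue_score(45, 10)
print(example_score)  # 100
```

On a 60-item form the highest possible score under this rule is 120, and
an admitted “don't know” is worth more than a wrong guess.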

The reliability of the test, determined by the odd-even method and
corrected by the Spearman-Brown formula, was found to be .84 and .80
with groups of industrial and college men (454). This is not as high as is
desirable and possible in aptitude and achievement tests, although it is
not too low for use; lengthening the test to 80 items with a 20-minute
time limit might well prove worth while.
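
The probable gain from lengthening the test can be estimated with the
same Spearman-Brown formula that was used to correct the odd-even
coefficients. The sketch below applies the standard formula to the two
reported reliabilities; it assumes that the 20 added items would be
comparable in quality to the present 60, which the source does not
establish, and the function names are illustrative.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability of a test lengthened by `length_factor`."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def full_length_from_halves(r_halves):
    """Odd-even correction: step a half-test correlation up to full length."""
    return spearman_brown(r_halves, 2)


# Lengthening from 60 to 80 items is a factor of 80/60.
for r in (0.84, 0.80):
    print(r, round(spearman_brown(r, 80 / 60), 3))
```

On that assumption the estimated reliabilities for an 80-item form are
about .875 and .842, a modest improvement over .84 and .80.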

Validity of the test has, even in the brief period since its development,
been checked in a variety of ways. The correlation with intelligence tests
was demonstrated to be low by coefficients of .32 (187 industrial
employment applicants) and .17 (173 college men) with the Purdue Adaptability
Test. When correlated with the Otis S.A. scores of 25 mechanics,
presumably a somewhat homogeneous group like the college students, the
coefficient was .08. Although its correlation with the California Capacity,
Non-Language, Test was .41, that with the Language Test was only .12
(40 apprentices). Correlations with the Bennett Mechanical Comprehension
and Minnesota Paper Form Board Tests were .71 and .18 for some
30 unidentified subjects, which suggests that, as one would expect, the
Purdue Test measures the informational component of mechanical
comprehension rather well but does not tap spatial visualization to any great
extent. These findings need, however, to be confirmed by other studies
with well described samples before they can be considered conclusive.

The authors (454) report no relationships with grades, as yet, their
interest having been primarily in the industrial use of the test.
Correlations with occupational criteria are mostly rank-order coefficients
based on very small groups, and so can be considered only as preliminary
indications of the test’s possible significance. If these data are followed
by more comprehensive validation studies, as the sponsorship of the test
suggests they will be, this is still a good deal more evidence in favor of
the test than is presented in most first editions of test manuals. A group
of 14 experienced mechanics in an ice company were rated by their
supervisors. The scatter diagram showing ratings and scores is long and
narrow, suggesting a rather high correlation ([illegible]) and a rather
effective cut-off score of 90. Six time-study men in a musical instrument
factory were ranked by their supervisor, the rank-order correlation with
the Mechanical Adaptability Test being .75 ± .18. Twelve steel mill
apprentices were tested at the time of hiring and ranked by their
supervisor after they had been on the job, the rank-order correlation being
.39 ± .23. Data for several other groups are reported, but as they were
used in standardizing the test they are not meaningful.

Although no data on occupational differences are as yet available, the
authors report differences between pre-occupational college groups which
are rather informative. The mean scores for 71 mechanical and
aeronautical engineering students were 103, civil, metallurgical, and
electrical engineering students 96, and science, pharmacy, and physical
education majors 92. The critical ratios between these groups were 3.8
(mechanical vs. non-mechanical engineers), 6.3 (mechanical engineers vs.
non-engineers), and 2.1 (non-mechanical engineers vs. non-engineers). These
significant differences suggest that this test is indeed a mechanical rather
than a scientific, or even physical science, information test, and that it
should be most useful in the counseling and selection of persons
considering mechanical work.
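
For readers unfamiliar with the critical ratio, the computation can be
sketched as follows, assuming the conventional definition: the difference
between two means divided by the standard error of that difference. The
means and group sizes in the example are those given above, but the
standard deviations are not reported here, so the values of 10 and 12 are
invented purely for illustration and the result should not be expected
to reproduce the published ratios.

```python
from math import sqrt


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means over the standard error of the difference."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Means (103, 92) and Ns (71, 103) from the text; SDs (10, 12) are hypothetical.
cr = critical_ratio(103, 10, 71, 92, 12, 103)
print(round(cr, 1))
```

A ratio of 3 or more was customarily taken to indicate a difference
unlikely to have arisen by chance, which is the sense in which the
differences above are called significant.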

As more studies are made it will be helpful to have comparisons of this
test’s effectiveness with that of the O'Rourke, as the most nearly similar
test available, and with that of the Bennett, as one which differs from
this in that it attempts to measure comprehension of principles rather
than familiarity with tools and processes. More detailed and specific
industrial norms will be helpful in counseling, although in selection local
norms must always be developed. And validation studies based on larger
groups with refined criteria of success are needed in order that the
occupational significance of the test may be known. In view of the simplicity
of the vocabulary, educational validation for trade and technical courses
should not be neglected. As such evidence is forthcoming the Purdue
Mechanical Adaptability Test will probably become a widely used and
useful diagnostic and prognostic instrument.



CHAPTER XI

SPATIAL VISUALIZATION 


THE ability to judge the relations of objects in space, to judge shapes
and sizes, to manipulate them mentally, to visualize the effects of putting
them together or of turning them over or around, is generally referred
to as spatial visualization. It is an aptitude which has long been
considered important in such clearly similar activities as machine-shop work,
carpentry, and mechanical drawing, in which the worker must judge
shape and size and translate two-dimensional drawings into
three-dimensional objects, and which has been considered likely to be
important in certain other occupations, the principal activities of which
were not quite so clearly similar, such as engineering and art.

Work in the measurement of spatial judgment began, however, as one
aspect of the measurement of intelligence rather than as an attempt to
measure a special ability of significance in certain occupations. Clinical
psychologists, attempting to devise non-verbal or performance tests of
intelligence which would be useful in appraising the mental ability of
persons with limited formal education or whose linguistic development
might in some other way have been handicapped, resorted to the familiar
puzzle-type test in which the subject is required to put objects together in
such a way as to make a pre-determined pattern. Sometimes the pieces to
be assembled were parts of a picture, as in the Mare-and-Foal Test used
in the Pintner-Paterson Scale of Performance Tests; in such cases the
cues relied upon by the examinee are partly spatial (the shape of the
curved outlines of the parts) and partly experiential (e.g., the head must
fit at the end of the neck). In other tests experiential content was not
utilized, as in the case of the Casuist Board, in which geometric figures
are put together to form large wholes, also geometric. In such tests, the
removal of cues based upon and requiring the analysis of experience was
part of an effort to make the test truly a measure of mental ability rather
than one of education. As subsequent work showed, it resulted in the
measurement of a trait which is related to mental ability in childhood
but relatively independent in adulthood.

When large-scale testing operations made it desirable to develop group
tests of the performance type, Army psychologists in World War I
produced Army Beta, a paper-and-pencil version of a performance scale. The
subtests, like those of the apparatus tests, involved completing incomplete
figures of people and other familiar items in which analysis of content
could help the examinee, and judging the relations of geometric figures,
in which it was hoped that abstract reasoning alone would play a part.
As such paper-and-pencil tests of spatial judgment were made available
for adult use, form boards were also developed for use with normal adults.
Link developed a Form Board which subsequently developed into the
Minnesota Spatial Relations Test, and Kent and Shakow devised a series
of Form Boards which has models for clinical use with mental patients
and for industrial use with normal adults.

Because of the emphasis on the measurement of the intelligence of
special groups which pervaded early work with tests of spatial relations,
and the subsequent application of such tests to industrial use, students
of testing are often confused by what seems to be a serious inconsistency
in the use of tests by psychologists. They find tests of spatial judgment
figuring prominently in intelligence tests such as Army Beta, the Army
General Classification Test, and the American Council on Education
Psychological Examination, and also masquerading as tests of a special
aptitude as in the case of the Minnesota Spatial Relations Test, the
Minnesota Paper Form Board, the Kent-Shakow Form Boards, and the Blocks
Test of the MacQuarrie Test of Mechanical Ability. The question arises,
is it possible that the same type of item can measure both intelligence
and a special aptitude not related to intelligence?

The theoretical explanation of what actually seemed to be the case was
slow in coming, because of the divergent interests and practical concerns
of both clinical and personnel psychologists. But it was implicit in data
familiar to most psychologists, for it had long been known that
performance tests of intelligence (i.e., form boards, tests heavily saturated
with spatial visualization) did not correlate well with other tests of mental
ability and gave poor predictions of school achievement, increasingly so
with increasing age. This suggested that spatial judgment might be a
special aptitude which develops at approximately the same rate as other
mental abilities, and therefore provides a fair measure of mental age in
childhood, but that, being a special aptitude, the degree of spatial
judgment possessed in middle adolescence or adulthood is not a good
indicator of the amount of any other mental ability possessed by the
individual. This has since been confirmed by Garrett (281) in an analysis
of test data for the appropriate years, and, in another way, by
Thurstone’s work (839), in which it was demonstrated that what is thought of
as intelligence is, in fact, a number of special aptitudes. In this analysis
spatial visualization emerges as one special aptitude, distinct from the
verbal, numerical, perceptual, memory, and other aptitudes which are
relatively independent of each other in homogeneous groups but tend to
be associated in heterogeneous groups. A spatial relations test is therefore
effective in classifying people according to “general” ability when wide
ranges of ability are in question, and so has a part in a test such as the
AGCT; on the other hand, when a group of fairly similar general ability
is being studied, whether they be factory workers or college students,
scores on tests of spatial relations are found to be related to success in
certain types of activity without being good predictors of success in
others. We have already seen, for example, that verbal scores on the
A.C.E. Psychological Examination give equally good predictions of
success in social studies and in mathematics, whereas quantitative scores,
which are partly based on spatial items, give substantially better
predictions of success in mathematics than in social studies.

Of the tests which have been developed for the measurement of spatial
visualization, the most widely used in vocational counseling and selection
have for some years been the Minnesota Spatial Relations Test and the
Likert-Quasha Revision of the Minnesota Paper Form Board. These will
be discussed in this chapter; it will be seen that the tests are impure,
for they measure certain other factors to a lesser degree. In addition
to these special tests of spatial judgment the user of tests should keep
in mind the spatial subtests of composite tests or test batteries such as
the Blocks Test of the MacQuarrie Test of Mechanical Ability, the
Surface Development Test of the Chicago Tests of Primary Mental
Abilities, and the Space Relations Test of the Psychological
Corporation’s Differential Aptitude Tests, all of which are discussed elsewhere
in this book.

Another very well-known test of spatial visualization is Johnson
O’Connor’s Wiggly Blocks (122, [illegible], 416, [illegible]), the
widespread use of which would justify discussion in this chapter if it were
not so unreliable as to make it useless. Mellenbruch (523) developed a
series of similar blocks at about the same time but did little with them,
and Uhlaner (unpublished study) has recently developed a reliable series of
curved blocks which may in time prove useful: further research with
Uhlaner's series should be encouraged and will bear watching. Crawford (933)
also has a test which, like the three wiggly blocks tests, attempts to
measure spatial visualization with three-dimensional material, but this test
also is new and there is as yet too little evidence to judge it by. In view
of the fact that judgment based on two-dimensional materials such as those
of the Minnesota, Thurstone, and Psychological Corporation tests may not
be identical with judgment of space based on three-dimensional
materials (there is only one study to suggest that it is), it is to be hoped
that further theoretical and occupational research will be conducted with
the more reliable of these tests.

The Minnesota Spatial Relations Test (Marietta Apparatus Co. and
Educational Test Bureau, 1930)

The Minnesota Spatial Relations Test was developed by the Mechanical
Abilities Research Project of the University of Minnesota because of
the promise of the Link Form Board (5N8). The latter test had a
reliability of only .71, as determined in the preliminary work of the project;
by using four boards instead of one, the new test achieved satisfactory
reliability. It has since been used in the Minnesota Employment
Stabilization Research Institute which added valuable normative material, and
in several other studies to be discussed below, but the administrative
expense of apparatus tests and the fact that it has a rather good
paper-and-pencil equivalent have kept it from being as widely used and
studied as some other tests. It is discussed here because it is a purer test
of spatial judgment than the paper-and-pencil tests, as will be seen later,
and therefore contributes materially to our understanding of the trait and
has special value in testing for the less abstract or academic types of
technical training and employment.

Applicability. Like the other tests of the Minnesota Mechanical
Abilities Project, the Spatial Relations Test was first used with junior high
school boys taking trade courses, but was designed with the objective of
making it usable with older adolescents and adults. Use of the test with
boys as young as 11 years old and with adults of all ages has confirmed
the belief that the nature of the task is such as to make it applicable to
a wide range of ability, spatial judgment beginning to mature early
enough for the items to be meaningful even before adolescence. As the
aptitude is still maturing during adolescence age norms are of course
needed, and here as elsewhere a problem is encountered in the vocational
counseling of adolescents. If one uses age norms in interpreting the test
scores of high school students, one runs the risk of encouraging a student
who is superior to his class or age group in spatial judgment to enter an
occupation for which he may actually be lacking in the aptitude in
question, because those who enter the occupation may be so highly selected
in that respect that he is actually at the bottom of the occupational
distribution even though near the top in the general norms for his age. Age
norms are available, as are some occupational data, but developmental
conversion tables are lacking which would enable a counselor of high
school students to determine how able a given boy or girl will be when
adult to compete with persons engaging in various occupations. Judging
by the age norms, spatial visualization increases until age 14 and remains
constant at ages 15, 16, and 17; there is a suggestion of an increase at
age 18, the mean score for which is somewhat higher than that for the
three preceding years, but the difference is not great and may be due to
elimination of some of the less able in the older sample: the 80th and
90th percentiles are about the same for ages 15 through 18, which fits
in with the explanation of the difference on the basis of sampling.
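
The risk described above, that a student near the top of his age norms
may fall near the bottom of an occupational distribution, can be made
concrete with a percentile-rank computation. The sketch below uses two
invented norm groups on an arbitrary scale in which higher scores are
better; neither the scores nor the occupational group are taken from the
test's published norms.

```python
def percentile_rank(score, norm_group):
    """Per cent of the norm group scoring at or below `score`."""
    at_or_below = sum(1 for s in norm_group if s <= score)
    return 100.0 * at_or_below / len(norm_group)


# Hypothetical norm groups; not data from the test manual.
age_16_norms = [38, 42, 45, 47, 50, 52, 55, 57, 60, 64]
occupational_norms = [58, 61, 63, 65, 66, 68, 70, 72, 75, 79]

student_score = 60
rank_in_age_group = percentile_rank(student_score, age_16_norms)
rank_in_occupation = percentile_rank(student_score, occupational_norms)
print(rank_in_age_group, rank_in_occupation)  # 90.0 10.0
```

The same raw score stands at the 90th percentile among the hypothetical
age-mates and at the 10th among the hypothetical workers, which is why
age norms alone can mislead in vocational counseling.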

Content. The Minnesota Spatial Relations Test is made up of four
form boards, of which A and B use the same pieces and C and D have
common parts. The arrangement of the parts differs, however, in the two
members of each pair, so that having placed them in Board A presumably
helps one in doing Board B only by orienting one to the task and
materials: it does not teach one where the parts go. The parts themselves are
cut from a rectangular board about three feet long by one wide: there
are three pieces of each shape, but of varying sizes, arranged close
together but not adjacent to each other in the board. The shapes include
crescents, squares, angles, and odd-shaped geometrical forms.

Administration and Scoring. The test is administered individually
and requires from 15 to 45 minutes, the average adult finishing all four
boards in 20 or 25 minutes. Although it is not stated in any of the
publications or manuals dealing with the development or administration of
the test, the subject stands while taking the test. Failure to include this
simple but basic detail in the manuals has resulted in the test being
administered with the client seated, in some guidance centers, and
standing in others, while some known to the writer have let the subject decide
which way to do it. At one of the latter places the staff reported that
subjects concluded it was more easily done standing; despite this fact, no one
took the trouble to write to the test author and ask how it was
administered to the subjects on whom the norms are based!

A letter from Professor Paterson, dated August 14, 1946, states: “It is
the rule to have subjects stand. This isn’t the worst part of the story.
Manufacturers have substituted different kinds of materials at will with
the result that the norms may not apply.” In order to ascertain the
possible effect of these different methods of administering the test, the
writer and a colleague (Charles N. Morris) conducted an experiment in which
the test was administered to groups with the subjects sitting while
taking all four boards, standing for all four boards, sitting for the first
two but standing for the last two, and standing for the first two but
sitting for the last two. Although comparison of the mean gains on the
second two boards over performance on the first two failed to demonstrate
clearly that higher scores are made if the test is taken standing up, there
was a tendency for those who took the test standing to do somewhat
better than those who were seated. In administering the test sitting
down, then, psychometrists may be penalizing their examinees and
obtaining an inadequate picture of their spatial judgment.

In view of the diversity of materials which, as Paterson points out, have 
been used in manufacturing the test, experiments should be conducted 
which check up on the effects of the variations on the suitability of the 
test norms. The point has already been made concerning the two 
different forms of the Minnesota Manual Dexterity or Rate of Manipulation 
Test. In connection with the spatial test, the point might be made that 
wooden equipment may fit less readily than metal, and that different 
rates of wearing render tests made of one type of material unusable more 
quickly than another. One manufacturer paints board and inserts black, 
as in the original wooden materials; another provided green-topped 
wooden inserts for black metal boards. In the Army Air Forces Aviation 
Psychology Program it was found that frequently used equipment soon 
wore so badly that the nature of the task was considerably changed. The 
form boards used in the experiment referred to in the preceding 
paragraph were not only somewhat worn, which made some pieces fit more 
easily, but somewhat warped, which made others fit less easily than they 
had originally. The effect of this on test scores and the suitability of the 
norms has not been checked. 

Apart from these questions of the examinee’s position and the nature 
of the test materials, administration of the test is straightforward. 
Scoring, in the original work of the mechanical ability project, involved 
obtaining the total number of seconds required to complete all four 
boards; the norms for boys are based on this procedure. This is the 
method described in the manual published by the Educational Test 
Bureau, publisher of the green-topped inserts and black-painted boards. 
The Minnesota Employment Stabilization Research Institute 
experimented with methods of scoring the test and found, however, that its 
reliability was increased by treating the first board as a practice trial and 
scoring only Boards B, C, and D (187). The general adult and 
occupational norms obtained in the MESRI work were therefore published in 
terms of the last three boards (306) and this is the recommended method 
of scoring. 
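
The two scoring procedures just described can be sketched in a few lines. This is only an illustration under stated assumptions, not a procedure taken from either manual: the function name `score_spatial_relations`, the dictionary layout, and the sample times are all hypothetical.

```python
def score_spatial_relations(seconds_by_board, include_board_a=False):
    """Return the total time in seconds over the boards that count.

    seconds_by_board maps a board letter ("A" to "D") to the seconds taken.
    include_board_a=False treats Board A as a practice trial (MESRI method);
    include_board_a=True sums all four boards (original project method).
    """
    boards = ["A", "B", "C", "D"] if include_board_a else ["B", "C", "D"]
    missing = [b for b in boards if b not in seconds_by_board]
    if missing:
        raise ValueError(f"missing times for boards: {missing}")
    return sum(seconds_by_board[b] for b in boards)


# Hypothetical times for one examinee, in seconds.
times = {"A": 420, "B": 360, "C": 390, "D": 375}
three_board_score = score_spatial_relations(times)
four_board_score = score_spatial_relations(times, include_board_a=True)
```

Whichever total is computed must be read against norms built on the same procedure, so Board A still has to be timed if the four-board norms are to be used.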

Norms. The boys’ norms provided by the Mechanical Ability Project 
cover ages 11 through 18, in grades 7 to 12, the numbers in any given 
group ranging from 55 to 150. All of them were boys in Minneapolis and 
St. Paul schools, and while they may have been a good local sample 
there are no data to enable one to judge the applicability of these norms 
in other localities. Norms are also available for 57 arts and 201 
engineering college students at the University of Minnesota, all freshmen. 
These norms are based on time for all four trials; if it is desired to use 
them, time for Board A should be recorded and included in the total. The 
Board A score will not be used, however, if the norms compiled at the 
MESRI are utilized. These are based on the now familiar standard 
sample of 500 employed men and women, and on various occupational 
groups of from 20 to 489 persons each. They are available in abbreviated 
form in the Educational Test Bureau manual, recomputed for all four 
boards, but this is an inferior method. In view of the paucity of data 
about the norm groups in the Educational Test Bureau manual, the 
inferiority of its scoring system, and the relative unavailability of the 
Minnesota bulletins in which the better type of norms are published, 
general adult norms are provided in Table 24 and occupational median 
scores, also from the MESRI, are provided in Table 25. 

Standardization and Initial Validation. When existing tests were 
being surveyed for possible use in the research of the Minnesota Mechanical 
Ability Project, Link’s Form Board seemed one of the most promising. 
Included in the preliminary research, it proved to have less reliability 
than that needed for its scores to be usable in individual diagnosis. It 
was therefore lengthened by making a total of four boards with the same 
type of items, and a satisfactory reliability was obtained. 
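
The gain in reliability to be expected from lengthening a test is conventionally estimated with the Spearman-Brown formula, r_k = k r / (1 + (k - 1) r), where r is the reliability of the original test and k is the factor by which it is lengthened. The text does not say that the project used this formula, and the single-board reliability below is an invented figure, so the following is a sketch of the general principle only.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability of a test lengthened by length_factor."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


# Hypothetical single-board reliability; not a figure from the source.
one_board = 0.50
four_boards = spearman_brown(one_board, 4)
```

The formula assumes that the added boards are parallel to the original one, which is the sense of "the same type of items" in the passage above.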

Like the other tests in the Mechanical Ability Project, the Spatial 
Relations Test was subjected to rather thorough study and validated 
against success in mechanical activities. It was found to have a low 
correlation with intelligence as measured by the Otis (r = .18); the group 
was a fairly homogeneous one of 100 7th and 8th grade boys. It had a 
rather high correlation with the Minnesota Mechanical Assembly Test, 
based on the same group (r = .56) and with the Stenquist Picture Test 
(r = [illegible]). When correlated with a mechanical interest inventory a 
relationship was again found, r being [illegible]. Scores were not, however, 
related to the father’s occupation, the household chores engaged in by the 
boys, and similar environmental data. 

Validation in this early stage was done against ratings of the quality 
of shop work done by the boy: the work was a standard task carefully 
rated by the instructor. The group were the same 100 7th and 8th 
graders. The correlation of .53 showed that this was one of the most valid 
tests in the Minnesota battery for the prediction of success in mechanical 
activities. 

Reliability. Using all four trials in computing the score, the original 
study of the Minnesota Spatial Relations Test yielded a reliability of 
.8.] based upon scores of 100 7th and 8th grade boys (corrected for 
attenuation). When the last three boards only were counted, with 
Board A serving as a practice trial, the reliability for 482 adult men in a 
selected sample of the employed population was .91 (187). 

Validity. Criteria used in studying the validity of the Minnesota 
Spatial Relations Test include the usual variety of tests of other abilities, 
grades in school and college courses, ratings of work samples, and 
differentiation between persons in various occupations. Ability of the test to 
yield predictions of success in employment has not been studied, perhaps 
because of the difficulty of administering an apparatus test to large 
numbers of employment applicants and the availability of a paper-and-pencil 
version of the same test (the Revised Minnesota Paper Form 
Board, discussed in the next section). 

Table 25 

MEDIAN SCORES FOR VARIOUS OCCUPATIONAL GROUPS 

Number   Group                             Median Percentile 
  102    Garage mechanics 
  170    Manual training teachers 
   62    Ornamental iron workers 
  113    Men office clerks 
   20    Draftsmen 
   29    Minor bank officials 
   84    Retail salesmen 
   47    Life insurance salesmen 
  480    Occupationally unselected men 
   26    Minor executives 
   89    Janitors 
  124    Policemen                         27 
   33    Casual laborers                    2 

We have seen that the original work with the spatial relations test 
yielded a correlation of .18 between spatial scores and scores on the Otis 
Self-Administering Test of Mental Ability. In an unpublished study of 
100 NYA youths aged 16 to 24 the writer obtained a correlation of .25 
between the same two variables. Andrew (21) correlated spatial relations 
test scores with scores on the Pressey intelligence tests, finding r’s of 
.43 and .36 for groups of 334 unselected men and 131 unselected women 
in the MESRI project and an r of .25 based on 200 women clerical 
workers. The higher coefficients were obtained with more heterogeneous 
groups such as unselected adults, and the lower figures with less 
heterogeneous groups such as 7th and 8th grade boys; it seems legitimate to 
conclude that in homogeneous groups there are variations in ability to 
visualize spatial relations which are quite independent of general mental 
ability, and that in heterogeneous groups the relationship between the 
two is positive but not high enough to make one useful by itself as a 
predictor of the other. 

Manual dexterity has been studied in relation to spatial judgment by 
Andrew (21) and by the writer in an unpublished study. The former 
investigator correlated scores on the Minnesota spatial test with scores 
on the O’Connor Finger and Tweezer Dexterity Tests; her subjects were 
200 women clerical workers. The correlations of .28 and .31 showed that 
the two types of aptitude overlap slightly, but are virtually independent. 
In the writer’s study of 100 NYA youths aged 16 to 24 the Minnesota 
Manual Dexterity Test (Placing) yielded a correlation of .05 with the 
spatial test, confirming the findings of the original unpublished study of 
the placing test which was developed in order to ascertain the role of 
manual dexterity in the Minnesota Spatial Relations Test. The 
conclusion that differences in manual dexterity do not affect scores on the 
spatial test therefore seems warranted. 
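
One way to see why correlations of this size are read as near independence is to square them, since the squared coefficient gives the proportion of variance two measures have in common. The short sketch below applies this to the three coefficients quoted above; the function name is hypothetical, and squaring is a standard interpretive aid rather than a step reported in the studies themselves.

```python
def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


# Coefficients quoted in the text: Finger and Tweezer Dexterity, then Placing.
for r in (0.28, 0.31, 0.05):
    print(f"r = {r:.2f}: shared variance = {shared_variance(r):.4f}")
```

Even the largest of the three coefficients leaves more than nine-tenths of the variance in spatial scores unaccounted for by dexterity.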

Mechanical comprehension was seen in the last chapter to be composed 
of spatial judgment and mechanical information. The correlations 
between scores on spatial visualization tests and tests of mechanical 
comprehension were reviewed and discussed in some detail, and are therefore 
not repeated here. 

Spatial visualization has been measured by other instruments, the 
scores of which have been correlated with those on the Minnesota 
apparatus test. The original, free-response, form of the Minnesota Paper 
Form Board was reported by Paterson et al. (588) to have a correlation 
of .63 with the apparatus test. In the writer’s unpublished study of NYA 
youths, the correlation between the Revised Minnesota Paper Form 
Board (multiple-choice form) and the apparatus test was found to be 
.59; Harrell (336) found it to be .63. No data have been seen concerning 
relationships between scores on this two-dimensional test of spatial 
relations and such presumably three-dimensional tests as the Wiggly Block 
and the Crawford Spatial Relations Test, although it would seem to be 
very important to ascertain the relationship between ability to judge 
relationships of two-dimensional objects and ability to think in terms of 
three-dimensional space. It may be that, in working with two-dimensional 
objects, one actually works in three dimensions, mentally turning objects 
around and over, so that there is no real difference between the two types 
of tests; but this has not yet been demonstrated to be the case. 

Factor analysis studies including the Minnesota Spatial Relations Test 
have been carried out by Andrew (21), Harrell (335, 336), Wittenborn 
(935) and the Staff of the Occupational Analysis Division of the United 
States Employment Service (735). Andrew’s study focussed on the 
Minnesota Clerical Test, but her factor analysis confirmed the existence of a 
distinct spatial factor. Harrell worked with a total of 37 variables, 
including the Minnesota Spatial Relations and Mechanical Assembly 
Tests, the MacQuarrie, the Stenquist Picture Test, and Thurstone’s 
Primary Mental Abilities Tests. He located five factors, including spatial 
visualization, perceptual ability, and manual agility; the first-named 
factor was the important one in the Minnesota Spatial Relations Test, 
although when accuracy was scored rather than time the perceptual 
factor also played an important part. Wittenborn’s analysis of the 
definitive Minnesota battery isolated only a spatial factor in the 
Minnesota Spatial Relations Test; this factor was found to be the only one 
of importance in the Paper Form Board, the Assembly Test, the 
Mechanical Interest Analysis Blank, and, most significant of all, the shop 
operations quality criterion, thus further confirming the conclusion that 
spatial visualization is a distinct factor and the principal factor 
underlying aptitude for mechanical work. 

The USES study, of which only a summary report has been published, 
found that the Minnesota Spatial Relations Test is heavily saturated 
with a spatial factor, and that two other factors play a part in it. One 
of these was a space-perception factor, isolated in this study and in 
Harrell’s but not in Andrew’s or Wittenborn’s, presumably because of 
the smaller number of tests used in the last two studies. The other was 
difficult to define; it has a wider significance than Thurstone’s induction 
factor, and seemed to have some of the properties of Spearman’s general 
factor; since they used a multi-factor method of analysis the authors 
hesitate to call it general intelligence, but consider it more likely to be 
that than anything else. Since the subjects used were adults, aged 17 to 
39, the finding of a general intelligence factor would be important, not 
only because it would explain why spatial tests can be used as measures 
both of general ability and of a special aptitude, but also because it 
would contradict the theory of group factors which, in America, has been 
accepted to the exclusion of Spearman’s two-factor theory. Obviously, the 
USES data must be reported in more detail, and confirmed by other 
studies, before conclusions of such major importance can be drawn. In 
the meantime, it can be concluded that there is a distinct spatial factor 
which is the most important element in the Minnesota Spatial Relations 
Test and in mechanical success, and that perceptual ability does also play 
a part in this test. 

Another approach to this question is available through an 
unpublished thesis by Tredick (869), reported by Goodman (297). In this work 
Tredick correlated scores on the Minnesota Spatial Relations Test with 
factor scores derived from Thurstone’s Primary Mental Abilities Tests 
administered to 113 freshman college women. Significant correlations 
were found with the perceptual, spatial, and reasoning factors (.55, .49, 
.17), and with the other reasoning or deductive factor (.33). These data 
tend to confirm the USES findings in so far as components in the 
Minnesota test are concerned. 

Grades and ratings of performance in mechanical tasks were used as 
criteria by Brush (122), Tredick (869), Stanton (719), and Steel, Balinsky, 
and Lang (731). Brush used 104 engineering students at the University 
of Maine as his subjects, correlating spatial relations scores with 
freshman and four-year grades; the results were disappointing, the r’s in both 
cases being .oh. It should be noted here that the Revised Minnesota 
Paper Form Board yielded validity coefficients of .92 and .4^, which 
suggests that the heavier loading of intelligence in the paper version of 
the test makes it superior for predicting success in technical activities 
which are as abstract as college engineering courses. Tredick also found 
this to be the case in a different college curriculum. The students studied 
by Tredick were 1 0; freshman students of Home Economics at the 
Pennsylvania State College, her criteria being grades in several courses and 
semester-point-average for the first semester. Correlations between test 
scores and grades were .20 for Art, .22 for Chemistry, .02 for English 
Composition, and .23 for semester-point-average. The relationships are 
in the expected directions, but not high enough to make the test usable 
by itself; it might have some value in a battery of unrelated tests. 

The nearest approach to a repetition of the original validation of the 
Minnesota tests was made by Stanton (719), who correlated scores on 
Minnesota Battery A against ratings of shop work performed by deaf 
boys and girls. She worked with 121 boys and 36 girls, aged 12 to 14. The 
battery as a whole had correlations of ..{8 and qb with the ratings; the 
validity of the spatial test alone was not given. While not as high as the 
coefficients reported by Paterson (588) these are high enough to make the 
test useful in counseling and selection when combined with other data. 
The work sample approach was used also by Steel and associates, in a 
study already discussed elsewhere in this book. For boys the correlation 
was .25; for girls, .39; as pointed out in a previous discussion, experience 
may have counteracted the effect of individual differences in aptitude in 
the boys more than in the girls, but in both cases the test had some 
validity. 

Success on the job, it has already been pointed out, has not been used 
as a criterion of the validity of the Minnesota Spatial Relations Test. 
Ross (651) established a critical score for machine-tool trainees, setting 
it at the 30th percentile. There were approximately ~}o trainees. But the 
criterion was grades in on-the-job training. 

Occupational differentiation on the basis of spatial relations test scores 
was studied first by the Minnesota Employment Stabilization Research 
Institute (306) and then by Teegarden (816). In the former study garage 
mechanics were found to make a median score equal to the 8;,th 
percentile of the general population, while manual training teachers stood 
at the 73th: ornamental ironworkers and men office clerks were also one 
sigma or more above the median (ticjth and With percentiles). Draftsmen 
were, surprisingly, at only the 59th percentile; the middle range included 
also such groups as retail salesmen, bank clerks, minor executives, and 
life insurance salesmen, while the lower ranges included janitors (30th 
percentile), policemen, and casual laborers (27th and 2nd percentiles). 
These differences are about as might be expected, except for the fairly 
high standing of the office clerks and the lower standing of the 
draftsmen; perhaps the latter would show up better on a paper-and-pencil 
test such as the Minnesota Paper Form Board, which would seem to 
approximate the medium in which they work more closely than does an 
apparatus test. 

The group studied by Teegarden was younger and less experienced, 
and her general adult norms were locally established, which makes 
impossible the merging of her occupational norms with those of the MESRI 
project without going back to the raw scores. Within the limitations of 
her sample, it is instructive to note that there were no groups which made 
significantly high scores, with the exception of male operatives 
performing hand work in factories, who stand at the 7 |th percentile, and female 
assembly workers at the 72nd percentile. But women hand operatives 
stand at the 55th, leading one to question the data for men; the numbers 
were not large, ranging from as few as 22 to 123 workers per group. 
Women packers and wrappers were at the 67th percentile, men at the 
62nd. All other groups of men and women were between the 44th and 
65th percentiles. As none of the occupations studied were skilled or 
technical occupations, the failure to find clear-cut differentiation is not 
surprising. The MESRI occupational norms are much more helpful; we 
have seen that they revealed a tendency for technical and skilled workers 
to make high scores, and for others to make average or low scores, 
depending upon the intelligence level. 

Job satisfaction may be related to having a modicum of the ability 
required to perform the tasks which constitute the job, but the role of 
spatial visualization in vocational satisfaction has not been investigated. 

Use of the Minnesota Spatial Relations Test in Counseling and 
Selection. Data reviewed and discussed in the preceding paragraphs make it 
clear that the Minnesota Spatial Relations Test measures at least three 
factors, the most important of which is ability to visualize and judge 
spatial relations. Ability to perceive spatial differences is also tapped by 
the test, and indeed it is difficult to imagine a test of ability to judge 
spatial relations which would be entirely independent of ability to 
perceive spatial differences and similarities. The third factor is reasoning 
ability, something approaching general intelligence, which plays a part 
in this test but is less important than the first two factors. Because of the 
common rate of maturation and because of the fact that abstract 
reasoning plays a part in the test used to measure spatial judgment, some 
relationship is found between the spatial relations and intelligence test 
scores of heterogeneous groups; despite this, the spatial relations test 
can be thought of as measuring something distinct from intelligence 
when working with homogeneous groups. 

In working with college students this means that one can expect a 
large percentage of average and moderately high scores, while in less able 
groups one will encounter more low average and low scores; these must 
be seen in perspective, the counselor realizing that a moderately high 
spatial score in a very able person does not mean special aptitude for 
professional-technical work and that a high average spatial score in a 
person of low average intelligence may well indicate promise for the 
skilled trades. 

Changes with age were seen to take place up to about age 14, after 
which it appears that the aptitude is relatively stable. More work needs 
to be done before this can be considered conclusively demonstrated, but 
it seems a safe working principle. 

Occupationally viewed, the Minnesota Spatial Relations Test 
measures an aptitude which is found in a higher degree in workers in skilled 
trades and professions such as automobile repair work, manual training, 
and ornamental iron. This is true also of workers in semi-skilled 
occupations in which job analysis suggests that spatial judgment should be 
important; these have been found to include hand-working operatives 
in factories, assembly workers, and packers and wrappers. Although one 
would expect draftsmen to excel on a test such as this, the one study 
which included such workers found that they were only high average in 
spatial ability as measured by this test. This seems somewhat anomalous 
and indicates a need for caution in making assumptions concerning the 
test; further studies should be made of the relationship between drafting 
success and scores on this test. Most office and minor executive groups 
make moderate scores on the test, presumably because they tend to be 
of moderate intelligence. Semi-skilled and unskilled workers in 
occupations not requiring spatial visualization tend to score average or below 
on this test, because of selection and because they tend to have less 
general intelligence than other workers. 

In schools and colleges the Spatial Relations Test is useful for selecting 
students who are likely to do well in shop courses, although it is of less 
value for the more abstract types of technical training than for the more 
concrete. 

In Guidance Centers and Employment Service Offices the test can be 
helpful in cases of clients considering the choice of technical occupations, 
especially at the semiskilled and skilled levels for which a paper form 
board is sometimes too abstract. It has value in helping in the choice of 
trade and technical training, and in determining a client’s prospects of 
making a quick adaptation to the demands of certain semiskilled jobs 
for which training is offered during the induction period; these latter 
include especially work such as assembly of vari-formed parts, machine 
operation, and packing objects of different shapes and sizes. 

Business and industrial personnel workers should find the test useful 
in selection of the type just described above. As an aptitude test it is 
most useful, obviously, in selecting people for training in skilled 
occupations; this will happen most often in schools, but also to some extent in 
industry in connection with apprenticeships. It can have much greater 
value in industry in the selection of semiskilled employees who can 
quickly adapt to new jobs, who can readily master procedures of machine 
operation or assembly, and who, because of the speed and accuracy with 
which they judge size and shape, will produce more per hour of work 
and do it with less waste of materials. 

The Minnesota Paper Form Board, Likert-Quasha Revision 
(Psychological Corporation, 1934) 

The first form of the Minnesota Paper Form Board, used in the 
Minnesota Mechanical Ability Project (588), was a completion test based on 
the Geometrical Construction subtest of Army Beta, the non-verbal 
intelligence test developed by the U.S. Army during World War I. Since 
the scoring of completion items is laborious and subjective, requiring 
that the scorer scrutinize each response and make judgments as to its 
adequacy, it seemed highly desirable to find some way of converting the 
Minnesota Paper Form Board into a multiple-choice test. This was done 
by Likert and Quasha, unfortunately not early enough to be included in 
the MESRI studies (hi7). However, the early Minnesota studies of the 
completion test are probably indicative of the nature of the validity of 
the revised test, and a variety of minor validation studies have been 
made with the revision. 

Applicability. Army Beta was designed for and standardized upon 
unselected adults, the completion form of the Paper Form Board was 
developed for a study using early adolescent subjects, and the 
multiple-choice revision was designed for use with and standardized upon 
adolescents and adults. The directions are simple enough for children in the 
upper grades; the range of difficulty of the items is such that 10-year-old 
boys make a median score of 22 compared with the adult male median 
of 3), the r,th percentile in each case being f» and 16, indicating that 
individual differences are revealed at both age levels. The items seem 
to have a reasonable amount of challenge at all age levels, despite their 
abstract form. 

The effects of maturation can be studied in two of the sets of data 
provided by the 19 p test manual. One of these consists of the age norms 
for 9, 12, and 15-year-olds in the schools of Kearney, New Jersey, the other 
of data for grades four and five in the Bronx. In the former instance a 
25-minute time limit was used, instead of the usual 20-minute limit. The 
median scores for the three age levels (boys) were 18, 32, and 38, revealing 
a more rapid increase in the six years from 9 to 15 (three points per 
annum) than in the three years from 12 to 15 (two points per annum). 
This suggests that the growth of this ability begins to level off early in 
the teens, although it does not indicate the age at which the plateau 
begins. The grade data confirm the changes in pre-adolescence, but go 
no further. Lefever and others (459) found no relationship (r = -.14) 
with age in adults. As in the case of the MacQuarrie, Mitrano (r,;p4) has 
drawn some conclusions concerning age changes which are based on 
spurious factors in his data, and are therefore unwarranted. Studies 
should be made which would throw more light on the question of the 
age of levelling-off, to make possible the construction of developmental 
conversion tables such as are needed when the scores of growing 
individuals are to be compared to those of mature persons established in an 
occupation. The grade norms in the 1948 manual throw no more light on 
age differences. 
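
The per-annum rates cited for the Kearney age norms follow from simple division, as the sketch below shows. The middle median is taken here as 32, the only value that agrees with the stated gain of two points per annum between ages 12 and 15; that reading, like the function name, is an assumption.

```python
def points_per_year(age_start, score_start, age_end, score_end):
    """Average gain in median score per year between two age levels."""
    if age_end <= age_start:
        raise ValueError("age_end must be greater than age_start")
    return (score_end - score_start) / (age_end - age_start)


# Median scores for boys by age; the value for age 12 is an assumed reading.
medians = {9: 18, 12: 32, 15: 38}

rate_9_to_15 = points_per_year(9, medians[9], 15, medians[15])
rate_12_to_15 = points_per_year(12, medians[12], 15, medians[15])
```

On these figures the gain averages a little over three points a year across the whole six-year span but only two points a year in its last three years, which is the flattening the text describes.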

Content. The test consists of 64 items. Each item is made up of a 
“stem” and five possible choices from which to select an answer. The 
stems are the disarranged parts, from 2 to 5 in number, of a geometric 
figure. The responses are assembled geometric figures, only one of which 
could be made by putting the parts of the stem figure together. The 
problem is in each case to select the figure which corresponds to the 
assembled parts, which must sometimes merely be mentally pushed 
together in order to make an appropriate whole, and sometimes mentally 
turned around or even over. The items therefore resemble those of the 
real form board, except that there can be no trial-and-error work with 
the Paper Form Board: all the matchings of shapes and sizes must be 
done mentally. 

Administration and Scoring. The test is preceded by practice 
problems, with 20 minutes of working time allowed for the test proper. It is 
necessary to demonstrate how the booklet opens, and to be sure that the 
many examinees who prefer to follow their own visual cues rather than 
the psychometrist’s spoken directions do actually observe the 
demonstration. If this is not done correctly, some booklets will be turned in with 
a page of easier problems skipped and some more difficult problems 
attempted, making scoring impossible. Scoring is done by matching 
marked spaces with a key, and is objective and simple. Forms adapted to 
machine scoring have been published, with special norms. 
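
Scoring by key amounts to counting the marked spaces that agree with the key, which can be sketched as follows. The key, the responses, and the function name are all hypothetical, and the sketch counts right answers only; whether the published key applies any correction for wrong answers is not stated in this passage.

```python
def score_with_key(responses, key):
    """Count the responses that match the scoring key.

    Omitted items are given as None and earn no credit.
    """
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(
        1
        for given, correct in zip(responses, key)
        if given is not None and given == correct
    )


# Hypothetical five-item key and one examinee's answers (item 4 omitted).
key = ["C", "A", "E", "B", "D"]
responses = ["C", "A", "B", None, "D"]
raw_score = score_with_key(responses, key)
```

Because the comparison is mechanical, two scorers working from the same key should always reach the same total, which is the objectivity the completion form lacked.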

Norms. Because of piecemeal standardization of the Likert-Quasha 
revision the norms for the test are rather unsatisfactory. Series AA and 
BB grade norms for 9th and 10th grades and high-school seniors are based 
on guidance center clients in the first two instances and on students 
applying for admission to the arts and engineering colleges of New York 
University in the last, certainly not a typical group of high school seniors 
since it omits the 80 to 90 percent who do not go to college. The college 
freshmen were students at New York University; the freshmen engineers 
were at New York University and Northeastern University, no significant 
difference having been found between engineers in the two institutions. 
We have already seen, in the chapter on intelligence tests, what great 
differences exist between schools and between colleges; these norms are 
of local value, then, but can be no more than a rough guide to counselors 
or admissions officers in other communities and institutions. 

Series MA and MB norms are considerably better, the 9th grade norms 
representing three large cities, and the 12th grade norms, 60 New England 
schools. 

Stephens (756) administered the Revised Minnesota Paper Form Board 
to 2936 seniors and 3332 juniors, male and female, in all curricula in 
New England high schools, publishing norms based on them. As he 
points out, these are higher than the old national norms, which we have 
seen to be strictly local. The 1948 manual includes these norms, expanded 
by additional cases from subsequent samples. 

Hanman (329) tested 785 men in the educational program of the WPA 
in California, ranging in age from 20 to 65, with a modal age of 40. Their 
education varied as greatly, from none to the doctorate, with mode at 8 
to 9 years. The author concluded from his data that the old so-called 
national norms were too high (they were then based on 76 cases), failing, 
apparently, to take into account the fact that his was a selected, although 
large, sample, heavily weighted toward the lower end of the scale of 
education and ability. Such heterogeneous and skewed norms have the 
values and uses of neither homogeneous and skewed nor of heterogeneous 
and representative norms. 

The sample studied by Baldwin and Smith (38) consisted of 975 women 
employed by the Eastman Kodak Co. The group was divided into 16- to 
25-year-olds and 26- to 60-year-olds, norms for the younger group being 
somewhat higher than the original norms and those for the older group 
being somewhat lower. Although this is in no sense a cross-section of 
adult women, and the norms are not general adult norms, they are useful 
in that they depict a large occupational population of varying skills. 
The jobs to which they were assigned included unskilled repetitive jobs 
such as lens wrapping and highly skilled precision jobs such as final 
assembly and inspection of optical and mechanical equipment. The 1948 
manual includes these and other local but useful industrial norms, each 
set of which needs to be carefully studied by users. 

Standardization and Initial Validation. The completion form of the 
Minnesota Paper Form Board was one of the best tests in the Minnesota 
Mechanical Ability Battery; it had a correlation of .63 with its apparatus 
counterpart, and a validity coefficient of .52 against ratings of quality of 
shop work. In revising the test and making it more objective the 
multiple-choice form was used, practice problems were included to insure 
understanding of the task, stencil scoring was utilized, and three time 
limits were tried out, the intermediate limit proving to be the best. The 
test went through two revisions, and was standardized on college students. 
High-school norms were then added. It yielded a correlation of . jo with 
the Otis S.A. Test based on college students, and a correlation of .7r, 
with scores on the original completion form. Validity of the test was 
assumed to be demonstrated by the correlation with the original form 
and by the validity of that form; it was also ascertained by correlations 
of .49 with the mechanical drawing grades of engineering students and 
.32 with grades in descriptive geometry (617). 

Reliability. The uncorrected reliability based on the intercorrelation 
of the two revised forms of the test was found to be .79, while the 
split-half reliability was, corrected, .92 (617). This latter figure is made 
spuriously high, however, by the speeded nature of the test. The retest 
reliability after periods of one or more years had elapsed was ascertained 
by Ebert and Simmons (233) with children aged 10 to 14, the age groups 
varying in number from 73 to 110. For 10-year-old children retested at 
age 11 the reliability coefficient was .87, at age 12 .86; for 12-year-olds 
tested again at ages 13 and 14 the reliabilities were .87 and .80. It can 
safely be assumed, then, that the reliability is actually in the .80s and 
sufficiently high for individual diagnosis. 
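
The "corrected" split-half figure is presumably a Spearman-Brown estimate, 
the customary correction for doubling test length; the source does not 
name the method, so this is an assumption. A minimal Python sketch under 
that assumption follows. The function names are illustrative, and the 
half-test correlation is back-computed from the reported .92 rather than 
taken from the study. 

```python
def spearman_brown(r_half: float, factor: float = 2.0) -> float:
    """Estimated reliability of a test lengthened by `factor`,
    given the correlation between comparable parts."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def half_test_r(r_full: float) -> float:
    """Invert the doubling case: the half-test correlation that
    would yield a corrected reliability of `r_full`."""
    return r_full / (2.0 - r_full)


if __name__ == "__main__":
    # Back-compute the half-test correlation implied by the reported
    # corrected value of .92 (an inference, not a figure from the source).
    implied = half_test_r(0.92)
    print(f"implied half-test r: {implied:.3f}")
    print(f"corrected again:     {spearman_brown(implied):.2f}")
```

The correction assumes the two halves are equivalent and unspeeded; on a 
speeded test like this one the halves are not independent samples of 
performance, which is why the text treats .92 as spuriously high. 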

Validity. A criticism of time-limit tests such as this which is 
occasionally made by examinees or observers is that the imposition of a 
time limit makes the test a measure of speed and prevents it from 
measuring adequately the trait which it is designed to measure. We have 
already seen that Baxter demonstrated the independence of speed (the time 
required to attempt every item once) and level (the number of items 
correctly answered in unlimited time), in the Otis intelligence test 
(p. 108). Tinker (847) studied the roles of speed and level in the revised 
Minnesota Paper Form Board, confirming the finding that they vary 
independently. Scores obtained in a standard time limit were found to 
consist primarily of speed, with level of difficulty at which the subject 
could work playing a lesser part. Apparently tests would generally be 
improved if they were administered as level tests, making possible the 
more nearly pure measurement of the trait being assessed, but the mixed 
speed and level scores now obtained for most tests are useful despite their 
impurity. 

Intelligence having been measured by tests which included spatial 
judgment items, one of the first steps in validating the Minnesota Paper 
Form Board has been to correlate its scores with scores on tests of general 
mental ability. Sartain used two groups, one consisting of 46 inspectors 
in an aircraft factory (669) and the other 40 foremen also employed in an 
aircraft factory (671). Both groups took the revised Paper Form Board 
and the Otis S.A. Test, the correlations being .62 and .49; the reasons for 
the great difference are not clear, although the foremen may be a more 
homogeneous group. The writer's intercorrelation of tests administered 
to 100 NYA youth yielded a relationship of . p for the same two tests, 
which agrees not only with Sartain's foreman data but also with the 
relationship reported by Quasha and Likert (617). The NYA group was 
rather heterogeneous. 
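
How much a more homogeneous group can depress a correlation can be 
sketched with the standard formula for direct restriction of range on one 
variable. This is an illustration only: the source reports no standard 
deviations for either group, so the ratio of .7 used below is hypothetical, 
and the function name is likewise an assumption. 

```python
import math


def restricted_r(r_unrestricted: float, sd_ratio: float) -> float:
    """Correlation expected in a group whose predictor standard
    deviation is `sd_ratio` times that of the wider group, assuming
    direct selection on the predictor and a linear relationship."""
    r, u = r_unrestricted, sd_ratio
    return r * u / math.sqrt(1.0 + r * r * (u * u - 1.0))


if __name__ == "__main__":
    # Hypothetical: start from the inspectors' .62 and shrink the
    # predictor spread to 70 percent of its original value.
    print(f"{restricted_r(0.62, 0.7):.2f}")
```

Under these assumed conditions a moderate narrowing of range is enough to 
move a coefficient of .62 into the neighborhood of the foremen's .49, 
which is consistent with, though it does not prove, the homogeneity 
explanation. 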

The American Council on Education Psychological Examination was 
correlated with Paper Form Board scores in a study by Traxler (864), 
with 240 Merchant Marine Cadets as subjects. The correlation of total 
scores was . p; for the linguistic scores it was .4 j and for the quantitative 
it was . ji. Bryan (123) tested art-school freshmen, correlating A.C.E. part 
scores and Paper Form Board scores; for the spatial subtest of the A.C.E. 
the correlation was .44. Army Alpha has been found (4 jo) to have 
intercorrelations with the Revised Minnesota Paper Form Board which 
ranged from .44 and .41 at ages 14 and 14 (N = 159 and 109) but fell, 
unaccountably, at ages 14 and ib to only .11 and .17 (N = 86 and 44). 
When the Revised Paper Form Board was correlated with the parent 
test (Geometrical Construction) of Army Beta (446) the coefficient was 
found to be .47, considerably lower than that of .74 between the original 
and revised forms of the Minnesota Paper Form Board referred to 
earlier. The subjects were 9th grade boys in the Army Beta study, but 
college students in that of the two forms, which suggests that the larger 
correlation may have been obtained with the more homogeneous group. 
If this is so, then the revised test is more like the original Minnesota 
test than like the part of Army Beta from which they both originated. 

Manual dexterity is not an aptitude which one would expect to find 
playing a part in a spatial test as abstract as this is, but two studies have 
provided evidence concerning the degree of relationship. Thompson 
(824) found no relationship between either the Finger or the Tweezer 
Dexterity Test and the Revised Paper Form Board (-.08 and -.15); the 
writer obtained a correlation of .23 with the Minnesota Manual 
Dexterity (Placing) Test. The true relationship is presumably about zero. 

Mechanical comprehension has been seen, in the preceding chapter, 
to include spatial visualization among its components. For this reason 
there is little to be gained here by repeating the data concerning the 
relationship as shown in various studies. It should suffice to summarize by 
stating that the correlation with the O'Rourke is generally found to be 
about .40, with the Bennett about .35, with the Minnesota Mechanical 
Assembly Test about . j8 (one study only), and with the MacQuarrie 
about .35. This means that a test of so-called mechanical aptitude may 
contribute materially to the prediction of success even when a good 
measure of spatial relations is used, for the score on the latter only partly 
accounts for the score on the former. 
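
The sense in which one score "only partly accounts for" the other can be 
made concrete by squaring each coefficient to obtain the proportion of 
variance the two tests share. The sketch below uses only the three 
coefficients that are legible in the text; the dictionary and function 
names are illustrative. 

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two measures have in common (r squared)."""
    return r * r


# Approximate correlations with the Paper Form Board, as summarized above.
CORRELATIONS = {"O'Rourke": 0.40, "Bennett": 0.35, "MacQuarrie": 0.35}

if __name__ == "__main__":
    for test, r in CORRELATIONS.items():
        common = shared_variance(r)
        print(f"{test}: {common:.0%} shared, {1 - common:.0%} not shared")
```

On this reckoning roughly five-sixths or more of the variance in each 
mechanical test is independent of the spatial test, which is the basis for 
using both in one battery. 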

Spatial Visualization as measured by apparatus tests such as the 
Minnesota Spatial Relations Test, the Crawford Spatial Relations Test, and 
the Wiggly Block should be correlated with the same ability as measured 
by the Paper Form Board, in order that the instruments and the trait 
may be better understood. The writer found the correlation for the 
Minnesota test to be .59, his subjects being 100 NYA youths. Jacobsen 
(396) found that for the Crawford to be .20, based on data from 90 
mechanic learners. Estes (240) reported that for the Crawford as .26 and 
that for the Wiggly Block as .31, with data obtained from 76 engineering 
freshmen. Jacobsen's study, it will be remembered, reported a number 
of deviant results; disregarding it, therefore, we find only a moderate 
agreement among these rather different-appearing tests of spatial 
relations. The fact that the Paper Form Board is more heavily saturated 
with general intelligence or inductive reasoning than the apparatus test 
explains at least a part of the failure to agree more closely. It is also 
possible that there are differences between two- and three-dimensional 
spatial judgment, as the Crawford and Wiggly Block attempt to measure 
it; and it is true that when a test is as unreliable as the Wiggly Block it 
cannot often yield significant correlations with anything. 

Interest in mechanical and scientific activities as measured by Kuder's 
Preference Record was correlated with Paper Form Board Scores by 
Sartain (671), who found it to be negligible (r = .13 and .19). As the 
group consisted of foremen in an aircraft plant, who might be assumed 
to be homogeneous as to mechanical and scientific interests (high on the 
former but low on the latter), this probably does not tell us much 
concerning the relationship between technical aptitude and interest. 

Three factor analysis studies involving the Revised Minnesota Paper 
Form Board threw further light on the subject of the traits measured by 
this test. Morris ( 5 jip analyzed the intercorrelations of scores made by 
56 9-year-olds to whom the Pintner-Paterson Scale of Performance Tests, 
the Porteus mazes, the Henmon-Nelson intelligence test, and others were 
administered, together with the Paper Form Board. He found three 
group factors, which he called spatial relations, perceptual ability, and 
ability to discover patterns or a rule of procedure (induction). These 
resemble those found in the studies of the Minnesota Spatial Relations 
Test. Murphy (r,-r>) used the Paper Form Board together with the 
Terman Group Test of Mental Ability, the Revised Army Beta, the 
Detroit Mechanical Aptitude Test, the MacQuarrie, and others, testing 
1 pj 9th grade boys. Three factors emerged from this analysis: mental 
manipulation of relations expressed symbolically (presumably induction), 
speed of hand and eye co-ordination (in the MacQuarrie particularly), 
and mental manipulation of spatial relations (in the Paper Form Board, 
parts of the MacQuarrie and Detroit, and part of Army Beta). Estes (240) 
gave the Paper Form Board, Crawford Spatial Relations, Wiggly Block, 
and A.C.E. Psychological Examination (L and Q scores) to 76 engineering 
freshmen. A factor analysis revealed one common factor, but this may 
be due at least partly to the small number of tests. The implication, if 
correct, is that two- and three-dimensional tests of spatial judgment 
measure the same spatial factor, although imperfectly because of the 
different media. Until further evidence is available, it seems legitimate 
to conclude that the Revised Minnesota Paper Form Board measures 
spatial relations, perceptual ability, and inductive reasoning, in that 
order, and that although it measures spatial judgment by means of 
two-dimensional media this ability is the same as that measured by 
three-dimensional means. 

Grades and ratings of promise in training have been used as criteria 
in a dozen studies with this test. Stanton (7 lh) administered the original 
form to deaf boys and girls and obtained a correlation of .50 between 
scores and ratings of shop performance. Jacobsen (396) used it with 
between 80 and 90 mechanic learners in a war industry, and found that it 
correlated between .18 and .22 with fitness ratings, but the probable 
errors were so large as to make the relationships insignificant. Ross (br } U 
administered the Paper Form Board to p machine-tool trainees, but 
published no correlations. Apprentice pressmen were studied by Hall 
(325), with ratings of skill as a criterion; the correlation was .58. An 
attempt was made to differentiate between "good" and "poor" classes in 
an industrial and technical high school by means of the Paper Form 
Board, but Morgan (5.J1) reported failure to discriminate; his subjects 
were 319 8th grade boys applying for admission to a technical high 
school. 
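
The remark about Jacobsen's probable errors can be checked with the 
classical formula for the probable error of a correlation coefficient, 
PE = .6745 (1 - r^2) / sqrt(N). The sketch below assumes the older rule of 
thumb that a coefficient should be at least four times its probable error 
to be called significant; the source does not state which criterion was 
applied, and the function names are illustrative. 

```python
import math


def probable_error(r: float, n: int) -> float:
    """Classical probable error of a Pearson correlation."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


def is_significant(r: float, n: int, multiple: float = 4.0) -> bool:
    """Assumed convention: r must be at least `multiple` probable errors."""
    return abs(r) >= multiple * probable_error(r, n)


if __name__ == "__main__":
    # Extremes of the ranges reported for the mechanic learners.
    for r, n in [(0.18, 80), (0.22, 90)]:
        pe = probable_error(r, n)
        print(f"r={r:.2f}, N={n}: PE={pe:.3f}, r/PE={r / pe:.1f}, "
              f"significant={is_significant(r, n)}")
```

Both coefficients fall short of four probable errors under this 
convention, which agrees with the judgment that the relationships were 
insignificant. 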

Several studies have used engineering students as subjects. Berdie (78) 
obtained a low but significant correlation (.22) between test scores and 
honor point ratios of 154 engineering students. At the University of 
Maine, Brush (122) studied a group of more than 100 students, and obtained 
correlations of . j2 and .175 with first-year and . 13 and .21 with four-year 
grades. Physics grades at the University of Iowa were found to have a 
correlation of .26 by Stuit and Lapp (788). It can be concluded that the 
Revised Minnesota Paper Form Board does have value in selecting 
students for or guiding them in the consideration of engineering training; 
Brush found it one of the best aptitude, as contrasted with achievement, 
tests in his extensive battery, and it found a place in some of his best 
regression equations. 

Dental students were tested by Thompson (824), correlations with 
combined grades and ratings of 35 freshmen and jo seniors being 
respectively .2) and .bi; the difference is surprising, even when allowance 
is made for the fact that more professional work is included in the senior 
than in the freshman year. 

Art students have been studied with the Revised Minnesota Paper 
Form Board, on the assumption that spatial judgment is important in 
layout and related work. Barrett (jr } ) found that jo art majors at Hunter 
College were significantly superior to jo control students in spatial 
judgment, although the actual difference in scores was small. Thompson 
(824) obtained a correlation of only .18 between the test scores and 
point-hour ratio for 50 fine-art students. Bryan (123) used art grades as 
a criterion, reporting a validity of .19. 

Success on the job has been studied more frequently with this test than 
with its apparatus counterpart, thanks to its group procedure. Aircraft 
factory workers were studied by Sartain and Shuman in studies already 
described. The former tested 46 inspectors and 30 foremen (669, 671), 
the latter 263 engine and propeller workers, both skilled and semiskilled 
(717), and 297 supervisors of several grades (716); ratings were the 
criterion in all instances. Validity for Sartain's inspectors was ,.j7, for his 
foremen only .10 (as high as any in this study). For Shuman's workers it 
ranged from .16 to .59, depending upon the job: moderately high 
correlations (.38 or above) were found for inspectors, machine operators, 
foremen, job setters, and toolmaker apprentices; the only low coefficient 
was for engine testers, for whom the Bennett Mechanical Comprehension 
Test had equally low validity, and for whom the critical scores on both 
tests were low, which suggests that the job may have been more clerical 
than mechanical. In Shuman's other study the validity of the Paper Form 
Board for supervisors was found to be .33. The test would have improved 
selection by approximately 15 percent in each of Shuman's studies. 

Inspector-packers in a pharmaceutical concern were subjects of a study 
by Ghiselli (286), already described. Ratings served as a criterion of 
the success of the 26 girls, for whom the Paper Form Board had a 
validity of .57. Stead and Shartle (750) report a correlation of only 
-.01 between scores on this test and ratings of .j 1 inspector-wrappers, but 
as they do not describe the job it is impossible to determine whether or 
not this finding is in conflict with Ghiselli's. For can packers and 
merchandise packers they found validities of .28 and . |8, for two groups 
of power-sewing-machine operators .31 and .38, and for put-in-coil girls 
the astonishing figure of -.r,2. This last group made the highest mean 
score of any tested by the USES research program as reported by Stead and 
Shartle. Perhaps they were an able group who, bored by their routine jobs, 
actually tended to produce less than the less able girls. The criteria in 
the jobs mentioned were based on output, and the numbers of subjects 
ranged from 18 to .|(i. For lamp-shade sewers and pull-socket assemblers, 
also tested in this investigation, the validities approached zero. 

Occupational differences in spatial visualization as measured by the 
Revised Minnesota Paper Form Board are suggested by Barrett's study 
of Hunter College art majors (py), which showed slight but significant 
differences between these students and control students in other fields, 
and by that of the USES (750), which found that put-in-coil girls and 
lamp-shade sewers were high average when compared to clients of the 
Adult Guidance Bureau of New York, and that the other workers listed 
in the preceding paragraph clustered around the 35th percentile. The 
norms in the manual indicate, rather more helpfully, that engineering 
freshmen, at least in New York University, tend to score about five points 
higher than liberal arts freshmen, and that upper classmen in engineering 
curricula score about four points higher than freshmen. Barrett's art 
majors made an average score equal to that of the engineering upper 
classmen (raw score = 47), her controls an average equal to that of the 
engineering freshmen (raw score = 43) rather than the liberal arts 
freshmen, but the Hunter students were all upper classmen. More 
comprehensive and varied occupational norms are badly needed for this test. 

Satisfaction in a professional curriculum, if not in the occupation itself, 
has been studied with the Minnesota Paper Form Board. Berdie (78) 
gave the revised form to 154 engineering students and obtained 
curriculum satisfaction data by means of a modification of Hoppock's Job 
Satisfaction Questionnaire. The correlation between spatial visualization 
and curricular satisfaction was only .06. The study can probably not be 
considered definitive, because a curriculum is something abstract and, 
unfortunately, often somewhat unreal to the student, whereas a job 
is usually something rather tangible. Engineering students in particular 
are likely to be critical of the academic, despite ability and interest in 
technical matters. A study of vocational or job satisfaction might 
therefore yield different results. 

Use of the Revised Minnesota Paper Form Board in Counseling and 
Selection. Although the Minnesota Paper Form Board is found to have 
a moderately high correlation with tests of general intelligence, more 
refined analyses have demonstrated that it is primarily a test of spatial 
relations, a special aptitude or distinct factor, and that the test is also 
somewhat saturated with quantitative perception and inductive factors. 
It is the presence of this last, combined with the fact that some 
intelligence tests include spatial items, which makes the test correlate 
significantly with general intelligence tests. A spatial relations test may 
therefore make a distinct contribution to some test batteries. 

Maturation of ability to judge spatial relations seems to come in the 
early teens, with little if any increase after age 13 or if>. This suggests 
that adult occupational norms should be usable with high school juniors 
and seniors, and perhaps even with sophomores. 

Occupations for which the test has been found to have significance 
include professions such as engineering, art, and dentistry; skilled trades 
such as toolmaking, job setting and aircraft engine inspection; and 
semiskilled jobs such as inspection and packing of merchandise, cans, and 
other objects, power-sewing-machine operation, and electrical assembly. 
Supervisors and foremen of both skilled and semiskilled workers also 
tend to make superior scores on this test. 

In schools and colleges the test should be found useful for counseling 
concerning the choice of trade courses, engineering curricula, dental 
training, and the professional study of art. Presence of the trait in a high 
degree cannot be considered a good prognosticator of success, because 
of the importance of other aptitudes and traits, but its relative absence 
in an individual can be considered a danger signal. Despite the importance 
of spatial visualization in tests of so-called mechanical comprehension, 
the correlation between these two types of tests is low enough to prevent 
the use of both from being a duplication. 

In guidance and employment centers the use of the test can be 
comparable to that in educational institutions when choice of training is 
involved. It can be of value also in selecting individuals who are likely to 
adapt quickly to the demands of assembly work and machine operations 
in new jobs in which they might be placed. 

In industrial personnel work the Minnesota Paper Form Board can be 
valuable in the selection of adaptable workers for semiskilled 
employment, for the evaluation of workers on the job whose skills may be 
most readily utilized in new assembly or machine operations, and also in 
the selection of apprentices for training in the skilled trades. In any such 
selection or evaluation program other indices should also be obtained, 
and here too a good mechanical comprehension test, an intelligence test, 
oral trade tests, and evidence concerning leisure-time activities which 
throw light on aptitudes and interests may be important data. 



CHAPTER XII 


AESTHETIC JUDGMENT AND 
ARTISTIC ABILITY 

ARTISTIC ability has been broken down into six factors in studies 
conducted over the past twenty years by N. C. Meier and his students at 
the State University of Iowa (rjiij). The analytic procedures used were 
partly biographical, partly mensural, and should not be confused with 
the more objective procedures of factor analysis; but in the absence of 
analyses utilizing completely objective methods Meier's conclusions after 
years of research provide the best available insights into the nature of 
artistic ability. 

The six factors listed by Meier include manual skill, as evidenced in 
studies of the family histories of artists; energy output and perseveration, 
revealed in studies of biographies; aesthetic intelligence, by which Meier 
means spatial and perceptual aptitude as measured by Thurstone's tests; 
perceptual facility, or the ability to observe and recall sensory experiences, 
which this writer cannot distinguish clearly from the perceptual ability 
just mentioned, evidenced in biographical material and in a test of recall 
of observed material after intervals of 10 days and of 6 months (mii); 
creative imagination, defined as an ability to organize vivid sense 
impressions into an aesthetic product, a trait concerning the existence of 
which no satisfactory evidence has been adduced, save the aesthetic product 
itself and the uniqueness of ink blot interpretations (i>i2) which may 
actually indicate personality deviation; and aesthetic judgment, 
considered to be the most important single factor in artistic ability, 
defined as the ability to recognize unity of composition and believed by 
Meier to be not the application of a series of rules, but rather something 
which is innate in the neuro-physical constitution and modifiable by 
experience. Some of these factors, such as aesthetic intelligence, are 
treated as complexes which can or will be broken down into underlying 
unitary traits of the Thurstone variety; others, like aesthetic judgment, 
are considered themselves basic and unitary. 

As in any occupation, success in art may be due to various 
combinations of the abilities and traits just described. The artists whose 
lives Meier has studied are believed to have excelled in some of the 
abilities listed, although not necessarily in all. Meier developed his Art 
Judgment Test first, because of his conviction of the primary importance of 
this aptitude; his plans for subsequent work, financed by the Spelman and 
Carnegie Foundations, called for the development of tests for the two 
other abilities in his list which are not presently mensurable, namely, 
perceptual facility and creative imagination. The writer is unaware of 
any practical tests resulting from this work. Of the three other traits, the 
manual and intellectual factors are currently well measured by existing 
tests, already described, while the emotional characteristics, as we shall 
see, have so far not lent themselves to satisfactory measurement. 

In appraising artistic promise it would therefore seem well to use, a), 
tests of intellectual ability, particularly those tapping spatial factors: 
Tiebout and Meier (S j j) found that 50 outstanding artists selected from 
those listed in the Biography of American Artists had an average Otis 
I.Q. of 118, with their successes predominantly in the verbal and spatial 
items; b), tests of manual dexterity, although, as we have seen in that 
chapter, there is little in the way of normative material to assist one in 
test interpretation (presumably an average score or better would be 
desired); and, c), tests of aesthetic judgment, discussed in detail in this 
chapter. Other data must be gathered by means of techniques other than 
tests. These might include the expert appraisal of the counselee's sketches, 
paintings, or other art products; the summarization of experience in 
artistic avocations and activities; and the evaluation of motivation to 
persevere in art as shown in discussions of artistic activities and 
aspirations. 


AESTHETIC JUDGMENT 

Aesthetic judgment emerges as the one trait in Meier’s list of six which 
may be considered a candidate for discussion as a mensurable special 
aptitude not dealt with in this book under some other heading. It is for 
this reason that it is singled out for treatment toward the end of our 
list of special aptitudes, and before batteries of aptitude tests and 
measures of personality and interest are taken up. 

There are two well-known tests of aesthetic judgment: the Meier Art 
Judgment Test (a revision of the Meier-Seashore Art Judgment Test) and 
the McAdory Art Test, the original editions of which were both published 
in 1929. The graphic material used by Meier was more or less timeless, 
for it included masterpieces of art which appear to be able to withstand 
the temporary shifts of fashions and of schools; that used by McAdory 
was more transitory, for it included textiles, clothing, furniture, and 
architecture, and it need hardly be pointed out that the dresses and 
automobiles of the late 1920's no longer seem to represent the acme of 
good taste in composition. McAdory has until recently done no further 
work with her test (now being revised), while Meier has maintained his 
interest and his production. The McAdory is for the time being of purely 
historic interest; a summary of work with it will be found in Kinter (j2rp. 
Other similar tests are too new to have been studied. The Meier alone 
will be dealt with in this book, as the only art judgment test of practical 
significance for the psychometrist or counselor. Two tests of so-called 
creative artistic ability, both in reality worksamples, are also briefly 
treated here, for lack of a more appropriate place. 

The Meier Art Judgment Test (Bureau of Educational Research and 
Service, 1940 Revision) 

The first edition of this test, published in 1929, was known as the 
Meier-Seashore Art Judgment Test. It was revised and published as the 
Meier Art Judgment Test in 1940. During the intervening years Meier 
and his students at the State University of Iowa conducted a number of 
important studies in the nature of aptitude for artistic work, summarized 
in the 1941 Yearbook of the National Society for the Study of Education 
(520), in a brief monograph chapter (662), and in his broader treatise on 
Art in Human Affairs (521). Meier's perseverance in the study of artistic 
ability has given his institution a leading place in this field which has 
been rivalled only by the leadership in the study of musical aptitudes 
which it exercised under Carl Seashore; it is interesting to note that a 
mid-western state university has led in the "impractical" field of aesthetic 
research. 

Applicability. The revised form of the Meier Art Judgment Test, like 
its predecessor, has been standardized on junior and senior high school 
students and on college students. Greene (309:395) points out that the 
grade norms for the Meier-Seashore Test show nearly chance success at 
the 8th grade level, and refers to other studies which showed that the 
ranking of pictures by 10-year-olds was similar to that of average adults, 
that of 7-year-olds already showing considerable agreement. As the latter 
studies did not use the same types of materials as the Meier tests, it is 
not possible to draw precise conclusions from a comparison; it seems 
probable, however, that the judgments required by the Meier tests are 
more refined than those involved in the other studies, and that this more 
refined type of aesthetic judgment matures later. The median score for 
junior high school students on the revised form is 88, whereas that for 
senior high school students is 99. This difference is presumably due in 
part to selection, but, as the test has a very low correlation with 
intelligence, it may be concluded that it is due primarily to developmental 
differences. Meier seems to attribute this largely to experience in his book 
(521:131) but not in the manual (pp. 15-16). Apparently aesthetic 
judgment is still developing during the middle teens, making age norms 
necessary. As in the case of so many other tests, tables making possible 
the conversion of age-group percentiles into occupational percentiles 
would be highly desirable. It is noteworthy that training in art has been 
found to have little effect on scores (139). 

Content. The Meier Art Judgment Test, 1940 revision, consists of 
100 pairs of pictures, printed in booklets with one pair per page on one 
side of the sheet only. The pictures are largely paintings, sketches, etc., 
which are generally recognized as works of permanent merit; others are 
vases and other objets d'art; all were included because of agreement 
concerning their merit by a group of established artists and because 
of high biserial r's with total scores. In each pair, one member is the 
unaltered reproduction of the original work, while the other member is 
a slightly modified version. The modifications are designed to make the 
composition, form, etc., less pleasing to the eye; the nature of the 
difference is pointed out to the examinee. The examinee's task is to decide 
which picture he prefers in each pair, with no knowledge of which is the 
original picture (the paintings are not so well known that subjects are 
likely to recognize the original). 

Administration and Scoring. The Meier Test can be administered
either individually or in groups, but, as there is no time limit and there
is great variation in the amount of time required to complete it, it is
not a convenient test for group administration at any time other than
the end of a test battery. It is usually completed in less than one hour.
Scoring, by means of a stencil, is simple and objective.

Norms. The 1942 manual for the revised test provides norms for
144r, junior high school, 892 senior high school, and 982 college art school
students. The students were “interested in art” “for the most part,”
making the norms representative of neither general population nor art
students, except at the college level. The 25 schools represented were
scattered throughout the whole United States.

Standardization and Initial Validation. In the initial standardization
work for the earlier form of the test nearly 600 pairs of items were tried
out on over 2000 pupils in various types of schools and colleges. The 125
which were then retained were those which had the most discriminating
value and those which were most favored by a group of experts. The
current revision includes 100 items selected in a similar way during the
eleven years intervening between the two forms of the test. The method
of selecting the items may perhaps be considered evidence of validity, for
the answers scored “right” are those which are chosen by high scorers on
the test as a whole, and are those which are chosen by established artists.
That no established artists made low scores, and that some untrained
persons made high scores, was taken by Meier as an indication that the
test measured an aptitude rather than the effects of specific training (522),
although he has since modified his point of view to allow a somewhat
more important role for experience (521). Correlations with intelligence
test (Terman Group, Standard Binet, Thorndike) scores were found to
range from -.i.j to .28, indicating that it was not a measure of general
intelligence. Comparable data for the new form have not been published.

Reliability. The earlier form of the test had retest reliabilities which
ranged from .fit to .Gr, for non-art students, and from .Gq to .8;, for art
students (518; Leighton cited by j2r ) ;iji). These are lower than is
desirable in a test used in individual diagnosis, making caution necessary
in its use. The reliability of the 1940 revision ranges from .70 to ,8j,
the two lowest being based on students of Pratt Institute and a junior
high school, the two highest in an art school and a senior high school
(grades not specified). It is to be regretted that they were not raised for
more accurate diagnosis, but as Meier points out the test is really only
a screening device, which makes the reliability adequate.

Validity. All but a few of the published studies of the Meier Art
Judgment Tests are based on the older edition. Because of the similarity
of the two revisions they are briefly discussed here, together with the
little new material available.

Items in the early form were analyzed by Brigham and Findley, reported
by Kinter (425), who calculated the biserial coefficient of correlation
between items and total score; the correlations ranged from -.02
to .53. Perhaps this partly explains the relative unreliability of the first
edition of the test, which clearly contained dead wood. The revision used
Brigham and Findley’s data (522:14) to select the 100 best items, thereby
correcting this defect. When old records were checked with the new key,
greater differentiation was found.
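
The kind of item analysis described above can be sketched in a few lines.
This is only an illustration under stated assumptions: it uses the
point-biserial coefficient, a close relative of the biserial coefficient
which the original workers computed, and the function names and the
six-examinee data set are hypothetical. Each item is scored 1 when the
keyed answer was chosen and 0 otherwise, each item is correlated with the
total score (the item itself is left in the total, for simplicity), and the
items with the highest coefficients are retained.

```python
from statistics import mean, pstdev


def point_biserial(item, totals):
    """Point-biserial r between one 0/1 item and the total scores."""
    p = sum(item) / len(item)          # proportion choosing the keyed answer
    sd = pstdev(totals)                # population SD of the total scores
    if p in (0.0, 1.0) or sd == 0:
        return 0.0                     # no variation, so no correlation
    passed = mean(t for i, t in zip(item, totals) if i == 1)
    failed = mean(t for i, t in zip(item, totals) if i == 0)
    return (passed - failed) / sd * (p * (1 - p)) ** 0.5


def select_items(responses, keep):
    """Return (indices of the `keep` best items, all item-total r's)."""
    totals = [sum(person) for person in responses]
    n_items = len(responses[0])
    rs = [point_biserial([person[j] for person in responses], totals)
          for j in range(n_items)]
    ranked = sorted(range(n_items), key=lambda j: rs[j], reverse=True)
    return sorted(ranked[:keep]), rs


# Hypothetical data: six examinees (rows), four items (columns).
demo = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
kept, rs = select_items(demo, keep=3)
print(kept, [round(r, 2) for r in rs])
```

In the hypothetical data the fourth item is chosen mostly by low scorers,
so its coefficient is negative and it is the one discarded; items of this
sort are the dead wood referred to above.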

Intelligence test scores, correlated with scores made on the original
form by art students (141) and by college students (248), were only
slightly if at all related to aesthetic judgment (.28 and .05). These
findings agree substantially with Meier’s.

Spatial visualization test scores might logically be expected to be
related to art judgment scores, since the aesthetic judgments involve the
arrangement of objects in space. Brigham and Findley (in Kinter, 425)
found a correlation of .57 with the College Entrance Examination Board
spatial test, showing that the two aptitudes do have something in common.
Unfortunately no other intercorrelations of such tests have been
located, although the data for their computation have been available
(15). A factor analysis of a battery of tests of these two types, plus others
of art information, perceptual ability, etc., might throw considerable
light on the nature of art judgment.

Artistic judgment as measured by the McAdory test correlated only
.37 (907) and .27 (159) with the Meier-Seashore Test, a difficult finding to
explain. The Lewerenz Test of Fundamental Abilities in Visual Art is
related only to the extent of .55, but this is not surprising in an ability
test: it is, in fact, rather gratifying as an index of validity.

Art grades were related to scores on the first edition by Brigham and
Findley, who found the surprisingly high validity of . ]0 for a group of
50 students at Cooper Union but concluded, according to Kinter (425:61),
that the test did not have sufficient discriminating value, perhaps
because of the inclusion of poor items. No data are available for the new
form, for which they should be at least as good.

Ratings of creative artistic ability have been somewhat more extensively
used as a criterion of the validity of the Art Judgment Test. Carroll
(139) found a correlation of . }o between these two variables; Morrow
(5 j5) found a validity of . j8, and cited one by Jones of .69. Apparently
the test has considerable value in selecting the students who manifest
promise in their art courses.

The differentiation of occupational groups by means of the Art Judgment
Test has been demonstrated, primarily with students but to a lesser
extent with artists and art teachers. The manual for the first edition
shows that art teachers made higher scores than art students or students
in general, but no critical ratios were computed. Eurich and Carroll
(242) found that art majors ranked 8.14 points higher than other college
students on the old form, which seems especially important in view of
their finding that training had no effect on scores; the difference was
statistically significant, the groups large. Barrett (45) confirmed this
finding with the revised test, art majors at Hunter College scoring six points
higher than non-majors on the average, a difference which was significant
at the one percent level. More helpful than any of these data would be
correlations between pre-training scores and success in art work, but no
such data are available.

Vocational satisfaction has not, apparently, been related to art judg¬ 
ment in any studies. 

Use of the Meier Art Judgment Test in Counseling and Selection.
The evidence concerning the Meier Art Judgment Test indicates that
it measures an ability which varies from person to person, is found in a
higher degree among artists than among non-artists, is possessed by some
untrained persons in a very high degree, is distinct from intelligence and
only moderately related to spatial visualization, is not much influenced
by training in late adolescence, and is related to success in art training.
This ability therefore seems to be an aptitude in the narrower sense of
the term.

Development of aesthetic judgment appears to continue well into
adolescence, making age norms desirable. Just when development begins
to level off is not clear, however; as the ability is a relatively complex
one, it may be safe to assume that levelling off takes place in the late
teens or early twenties. Carroll’s work suggests that development is more
a matter of maturation, at least in late adolescence, but this question
needs further investigation.

Occupations in which aesthetic judgment may be important have
unfortunately not been extensively investigated, Meier’s efforts having been
absorbed in the study of other problems. That artists excel in it has
been demonstrated, but the writer knows of no data which show the role
which it plays in other fields, such as clothing design, dramatic
production, architecture, and landscape gardening.

In schools and colleges the Art Judgment Test should be useful as a
means of locating students who may have special talent and deserve
special opportunities for artistic training, special attention in art courses,
and encouragement to capitalize on extra-curricular opportunities for
the development of their talent, whether for vocational or for avocational
purposes. It can also be useful as a selection instrument in art schools,
although at this stage, as in counseling to a lesser extent, the evaluation
of artistic production often yields more helpful information. In judging
a client’s or applicant’s art work it is necessary that the judge be not only
an artist, but an artist who is used to appraising the work of beginners
in the light of the amount of training they have already had. When, for
special reasons, samples of a counselee’s work are not available, it may be
desirable to administer a worksample test of artistic ability such as the
Lewerenz Tests in the Fundamental Abilities of Visual Art or the
Knauber Art Ability Test, both of which were designed to measure
creative ability in art (described below).

In guidance centers the use of the test is similar to that in schools and
colleges, whether for counseling purposes or for selection in connection
with training programs. It has little place in the evaluation of
employment applicants, as these are normally already trained in art and can
better be judged by their work, unless an especially important position
is to be filled and it is desired to have a comprehensive study of the
applicant.

The business and industrial use of the Art Judgment Test is extremely
limited, for reasons just given. It may, however, prove quite valuable
at times when non-artistically trained personnel are to be selected or
transferred to work in which ability to judge good form and composition
is important, for example, in certain retail trade jobs of a merchandising
type.

Creative Artistic Ability

As was mentioned earlier in this chapter, tests of so-called creative
artistic ability are in reality worksamples devised to measure the
subject's ability to construct a good artistic design or to utilize the concepts,
vocabulary, and tools of the artist. As such they hardly belong in a
discussion of aptitudes in the narrower sense of the term, but logically
should be taken up in connection with custom-built tests or, if there
were enough such to warrant such a classification, with worksamples. In
this case it seems more practical, however, to treat these tests in the
chapter dealing with another special aptitude, the importance of which
is seemingly limited to the same occupations. If Meier makes available
the promised battery of art tests, a change in the organization and
location of this material will be warranted, giving it a section in the chapter
on custom-built batteries of tests.

The two worksample tests dealt with here are the Lewerenz and the
Knauber. Because of the similarity of content and the lack of subsequent
studies of their validity, both are briefly discussed, the Lewerenz being
given more space as it is a more manageable test.

The Lewerenz Tests in the Fundamental Abilities of Visual Art (Cali¬ 
fornia Test Bureau, 1927) 

Applicability. The test was designed as a measure of creative artistic
ability, for use in school systems. It was standardized on children in
grades 3 through 12. It can also be used with young adults who have had
no further artistic training.

Contents, Administration, and Scoring. Because of the independence
of the separate parts of the test, they are best described in detail
individually.

Test 1. Fifteen sets of drawings with four pictures to a set (multiple-
choice), including bowls, friezes, cornices, etc., varying from good to bad
in proportion and balance. Two parts, recognition of proportion in
standard forms, and problems of abstract proportion and balance. Time,
10 minutes. Score equals number right.

Test 2. Ten sets of dots in varying numbers. The subject is told to
draw any subject he chooses, using all the dots in each space with straight
or curved lines, then to write one word in the space to indicate what he
has drawn. The arrangement of dots varies, to permit formal and
fanciful interpretations. Time, 20 minutes. Score is obtained by comparing
drawings with six graded rating sheets.

Test 3. Ten drawings ranging from simple to complex. The subject
is required to indicate omissions of shades and shadows, the light being
considered as coming from the left. Time, 5 minutes. Score is the number
right.

Test 4. A vocabulary test, utilizing the matching method in five ten-
word sections dealing with materials, craft processes, graphic processes,
drawing terms, and pictures. Time, 20 minutes. Score is the number right.

Test 5. A black vase form mounted on a white background is exposed
to the subject for two minutes. After it is removed the subject is
instructed to draw it from memory, on a test blank which shows the top
and bottom of the vase with a vertical line through the center. Time,
5 minutes. Scoring is by a stencil.

Tests 6, 7, and 8 deal with ability to analyze problems in perspective:
cylindrical, parallel, and angular. The subject may use a ruler in
correcting incorrectly drawn lines in each of the three tests. Time, 5 minutes for
each test. Score is the number of correct responses.

Test 9. A color chart with six known colors at the top; below are 46
“unknown” variations divided into four sections. The initial letter of
the known six colors is used to indicate the one predominant known
color in each of the unknowns, by means of a six-response type multiple-
choice technique. Time, 20 minutes. Score is the number correct.

Norms. Norms are available for elementary grades, junior high
school, and senior high school, based on an unselected group of 1100
pupils. Separate norms may be used for part scores.

Standardization and Initial Validation. As has just been stated, the
tests were standardized on a supposedly typical group of school children.
Various comparisons were made by the test author with art students, the
correlation with art grades being . jo and the rank correlation between
performance and predicted ability .('>4. In subsequent studies, summarized
by Kinter, Lewerenz found a correlation between his tests and a test of
intelligence of .i;,*, for a group of over 1000 children. Sex differences
were also reported, girls being superior to boys in all but originality
and ability to analyze.

Reliability. A retesting of 100 pupils in grades 3 to 9 after an interval
of one month yielded a reliability coefficient of .87 (manual). No other
such data have been located.

Validity. Few studies have been made involving the Lewerenz tests
by persons other than the test author, which is to be regretted in view of
the fact that it is the more manageable of the two well-known tests
designed to measure creative artistic ability. Wallis (907) correlated the test
with the Meier-Seashore and McAdory tests, finding correlations of .54
and .-gS, higher than that between the last two, which are supposedly
more similar (.37).

Use of the Lewerenz Tests in Counseling and Selection. From the
above material it is clear that the Lewerenz tests are measuring, with
considerable reliability, various factors which are rather distinct from
intelligence, which have a substantial relationship with achievement in
art, and which vary with age and sex. An analysis of the content suggests
that these factors have to do with visual and creative artistic abilities, but
too few relationships have been determined, and no factor analyses have
been carried out, to enable one to draw adequate conclusions. On the
basis of the available evidence, however, one may tentatively conclude
that the tests have practical value in selecting students with sufficient
promise for further training in art.

Because comparatively little is as yet known about it, scores on this 
test must clearly be supplemented by a variety of other information, such 
as Meier scores, data on art training and interests, intelligence, ratings 
of art work, etc. 

The Knauber Art Ability Test (Distributor: Psychological Corporation;
1927, revised 1935)

Applicable at or above the junior high school level. Contents are
drawings in which the subject creates or completes drawings or locates errors;
they yield seven measures of presumed components of art, such as long
and short-time memory, observation, accuracy, creative imagination,
ability to visualize and to analyze, etc. Administration involves no time
limit, but the test normally takes about three hours. Drawings are rated
on a three-point scale. Norms are based on r>!)() students from 7th grade
to university sophomore, are in terms of grade percentiles.
Standardization and validation are described in the manual and in an article by
Knauber (139). The present form is standardized on 1 gfffi cases, after
trials of other forms on 300 art students and r,-,o art students. Art
students make a median score of 93 compared to that of 32 for non-art
students; art teachers make a median score of 123 contrasted with Or for
other teachers. With 92 art students as subjects, the correlation with the
Meier-Seashore Test was .57, with the Lewerenz, which it should resemble
more closely on a priori grounds, .(>]. Reliability. Retest reliability after
one year was .96 (43K); by the split-half method it was .93. Use in
counseling and selecting seems justified by the fact that the test distinguishes
between the various levels of artistic ability as shown in the group
differences reported. No data are available, however, on the effects of training,
which might account for these differences, except the evidence showing
high reliability over a period of one year. The test does appear to
measure creative ability, if the nature of the items may be taken as
evidence of validity. However, in view of the present limited knowledge
of the test, scores must be used with considerable caution. Those making
high scores may, if other evidence such as art judgment tests, intelligence,
interests, and ratings of art work, is favorable, be encouraged to continue
training in art; cases making low scores should be investigated further
before recommendations are made.



CHAPTER XIII 


MUSICAL TALENTS 


TO TURN our attention to musical aptitudes in this chapter, as to
artistic in the preceding chapter, is to risk abandoning the logic of the
organization of the book as a whole. For in this text the focus is first
on psychological characteristics, whether they be aptitudes, skills, or
traits, then on the means of measuring them, and finally on the
vocational and educational significance of the ability or trait being measured.
The use of the terms “artistic” and “musical” implies an orientation
which is primarily occupational. Useful as this latter approach is when
judging a person’s fitness for a specific occupational field or when
devising or selecting a battery of tests for a single area, it is not, on the whole,
as helpful as the psychological approach is to the counselor who seeks an
understanding of the person with whom he is working and who hopes,
through a sharing of that understanding with the client, to help him to
make appropriate vocational plans. In this chapter as in the preceding,
however, the focus on the occupational field is brief and introductory to
the discussion of specific aptitudes which happen to be important
primarily to one family of occupations. The aptitudes, in this instance, are
physical capacities which have been found to be fundamental to success
in music; they include such abilities as sense of pitch, sense of rhythm, and
sense of time. They are treated in some detail below, in connection with
the Seashore Measures of Musical Talents.

Music being a creative aesthetic occupation, it seems likely that many
of the traits which have been shown or are presumed to be of importance
to success in artistic occupations would also play a part in musical
success. Seashore has studied these in an early monograph (690) and
discussed them in his more recent general treatise of the psychology of
music (1)93), and the list does indeed tend to parallel that of his colleague
Meier in the field of art. Manual skill is considered necessary for
instrumental work in music, as for the use of tools in art; energy output and
perseverance is deemed important in music too, with its requirement of
hour after hour of routine practice; creative imagination is presumed to
play a part, not only in the composition of new works but also in the
interpretation of existing works; and emotional sensitivity may be
thought to be important in both creative and interpretive work, if the
musician or the artist is effectively to portray feeling and to play upon
the emotions of others. Intelligence may be assumed to be increasingly
important at the higher levels of musical endeavor: while it may not be
important in a blues singer, Stanton’s studies at the Eastman School of
Music (y.jS) showed that intelligence is important in mastering the more
abstract aspects of music. And, finally, Seashore’s investigations,
confirmed by those of Stanton and others, have shown that the physical
capacities measured by his tests are basic to musical success.

As the preceding paragraph implies, the only factors presumed to be
important to success in music which have satisfactorily been
demonstrated to be related to achievement in that field are intelligence and
Seashore’s psychophysical capacities. The writer has seen no
investigations other than the tentative early study by Seashore (690) which
demonstrated that musicians are superior to the general population in manual
skill, energy output, or creative imagination, or that scores on measures
of these factors are correlated with musical success. There is some
evidence which suggests that musicians may be more sensitive emotionally
than the general population, for the writer (791) found that male
amateur musicians who played in symphony orchestras were significantly
more likely to be unmarried, dissatisfied with their social life, and
dissatisfied with their occupations than were other men of the same age
and socio-economic status. If maladjustment is a sign of emotional
sensitivity, then the hypothesis is perhaps validated; but it is possible that
there is such a thing as emotional sensitivity without maladjustment, and
that it is sensitive persons who are not maladjusted who make the best
musicians. In any case, the writer’s subjects were amateur, not
professional, musicians. It cannot therefore be said that it has been
demonstrated that emotional sensitivity plays a part in success in music.

In view of the demonstrated importance of Seashore’s physical
capacities in musical success, the infrequency with which they play a part in
other fields, the lack of evidence concerning the significance of other
abilities in music, and the general rather than specifically musical nature
of the other characteristics which are presumed to affect success in music,
it seems legitimate to discuss Seashore’s tests and the capacities which
they measure under the heading of musical aptitudes or talents. Other
similar tests, described by Greene (309:425-438), are not dealt with here
because they have not been so thoroughly studied.

The Seashore Measures of Musical Talents (RCA Manufacturing Co.,
1939; since 1949, Psychological Corporation)

The initial work on the measurement of physical capacities which
might be important to success in music was begun by Seashore before
World War I. As in the case of other psychologists who were then
developing new measuring instruments, he continued his work during the
war, applying it successfully to the selection of submarine detection men
in the Navy. The first edition of the test for general use in musical
guidance and selection was published soon afterwards, in 1919. As a pioneer
in the study of the psychology of music, and aware, apparently, of the
value of focusing his research energies on one promising field, Seashore
continued to work with his tests, attracted graduate students who carried
out additional studies, and found financial support to press his and his
students’ investigations. As a result, his laboratory at the State University
of Iowa became the most active center for research in the psychology
of music and in the prediction of musical success in the United States,
and his tests are, together with the Stanford-Binet and Strong's
Vocational Interest Blank, among the best known, most widely used, and most
thoroughly understood instruments in the field of psychological
measurement. The tests were revised and a second edition published in 1939
(f ,r > 2).

For these reasons, the tests are treated here in some detail, even though
the frequency of their use in counseling is somewhat limited because of
the relatively few persons in musical occupations. Were it not for this
fact, they would be dealt with at much greater length, as an illustration
of the thorough type of work and multiple approaches which are needed
in making vocational tests useful.

Applicability. The first edition of the Seashore tests was designed for
use at any grade level, from the first grade to adulthood. Because of the
effects of motivation and attention on the test scores, however, the revised
manual recommends that the tests be used beginning with the fifth grade,
that is, with children of about ten years old. This is acceptable to
Seashore as a minimal age because it is also early enough to make possible
serious planning for musical training if it seems warranted.

The norms for the revised tests indicate that scores tend to increase
somewhat with age, for there is a steady increase in the means from
grades 5 and 6 to adulthood. Although these differences are slight,
amounting to only one or two points, they might conceivably be
interpreted as showing that the abilities in question are still maturing. The
ranges of scores are the same, however, at the different age levels, and the
reliabilities are somewhat higher in adulthood than in adolescence
(median r = .82 in adulthood, .78 in adolescence), facts which suggest the
validity of Seashore’s contention that the lower means of younger people
are due to problems of concentration, attention, and similar
administrative factors. If this is so, it becomes important to take especial pains to
establish good rapport when testing school-age children and to test in
two or three sessions. Seashore (690) and Stanton (718) have shown that
training and experience, e.g., three years in a school of music, do not
influence scores. The tests are therefore as applicable to adults as to
children, and vice versa.

Content. The tests consist of two series of three double-faced twelve-
inch phonograph records each. Series A is made up of wide-range tests
suitable for survey or screening purposes with heterogeneous groups,
while Series B has a higher base and “ceiling” in order to make it more
diagnostic at the higher ability levels and with music students. The six
capacities measured by either series are Pitch, Loudness (formerly called
Intensity), Time, Timbre, Rhythm, and Tonal Memory. The 1919
edition contained a test of Consonance, for which Timbre was substituted.
No verbal description can convey an adequate idea of the specific
content, but it may help those who do not have access to the tests to describe
the Pitch Test, for purposes of illustration, as a series of pairs of musical
notes. One member of each pair of notes is higher than the other;
sometimes the higher note comes first, sometimes last, in the pair; in later
pairs the two notes are of more nearly the same pitch than in the first,
the notes becoming more and more alike in pitch as the test progresses.
As a result, a point is reached at which it is virtually impossible to decide
which note is higher. This point comes early in the test for those lacking
in pitch discrimination, late in the test for those who excel in it. The
other five tests are built on similar principles.

Administration and Scoring. The manual for the 1939 edition gives
quite adequate directions for administering the tests, which require
about one hour. Several points deserve special emphasis, however,
because of the unusual nature of the medium. The records used must be
in good condition, neither scratched nor warped. So also should be the
record player, adjusted to play loud enough to be heard throughout the
room, and at the standard speed of 78 r.p.m. As the records are
monotonous, capturing the interest and retaining the co-operation of the
subjects is especially important: in a paced test such as this a little wandering
of the attention can spoil a test score. The manual recommends that
examinees lean slightly forward in a poised position which facilitates
concentration. Most unusual in the testing procedure is the desirability
of demonstrating the tests by playing parts of each record before testing;
the examiner gives the directions, then plays a few items near the
beginning of the record, asking all examinees to respond orally, and
permitting time for questions. He plays a few more items nearer the end of the
record, again asking for group responses and allowing questions. This is
to familiarize all subjects with the unusual type of test item, and to make
it truly a measure of capacity. It might be objected that the test is spoiled
by familiarization with the specific contents, but experimentation has
shown that practice does not vitiate the test if the excerpts from the
records are not consecutive (9; 2 j 6). Responses, in terms of “high, low,”
“strong, weak,” or similar terms, are recorded on simple answer sheets
which can be purchased or mimeographed; scoring is done by comparing
responses with a key or a homemade stencil, and counting the number
of correct answers. The tests can be given more than once for the sake
of greater reliability, and the scores averaged, which single fact more
than any other brings out the fundamental difference between this and
other aptitude tests!

Norms. Decile norms are provided for 5th and 6th grade pupils, 7th
and 8th graders, and adults, for Series A tests, and for adults only for the
Series B tests. No separate high school norms were deemed necessary,
because of the small differences, already referred to, between 8th graders
and adults. The normative tables do not indicate the number of cases on
which the standardization was based, but the table of reliabilities in the
manual makes it clear that the numbers in each grade group varied
from about 1000 to 1700 pupils, depending upon the test, and from 600
to 1100 adults, the smaller numbers of cases being for Series B. There is
no indication as to how the samples were selected; as Series A is designed
as a survey test it should be a cross-section of school children and adults
in general for that series, and, for Series B, the diagnostic test, a group of
adults studying music. The manual is defective in not making the
nature of the samples explicit.

Standardization and Initial Validation. Adequately to describe the
extensive and intensive standardization and validation studies carried out
with the Seashore music tests by Seashore, his students, and other
psychologists interested in music would require far more space than abilities
of such limited occupational significance merit in a text such as this.
In fact, even the full-sized volume in which Seashore discusses his
twenty-five years of work with the tests is tantalizing to a scholar because
of its generality and lack of specific data on what was done and with what
results. For present purposes, it seems best to survey a few of the studies
of the validity of the tests, referring those interested in their
standardization to the monographs by Seashore and his colleagues (bpo,bp:.',bbi>).

Reliability. Farnsworth (2 jb) reviewed the studies of the reliability
of the old form of the tests in ippi, SS in all, and concluded that only
the tests of pitch and tonal memory were sufficiently reliable for use with
individuals. Drake (210), for example, found that the better tests had
reliabilities of about .8b; these were odd-even reliability coefficients,
corrected by the Spearman-Brown formula, and might be spuriously high
in a test which is paced and therefore somewhat speeded. However,
Larson retested children and adults with substantially the same
results. The revised battery has higher reliabilities, on the whole; for
Series A they range from .bp to .84 at grades 5 and 6, from .(ip to .87
for 7th and 8th graders, and from .b:> (the next higher is .7]) to .88 for
adults. The median reliabilities at the same levels are .78, .78-,, and .8i».
For Series B the coefficients are somewhat lower: .70 to .8p, with a median
of .735. Tonal memory is the most reliable test in the new battery, with
pitch and loudness about equally good, while timbre, which replaced the
unsatisfactory test of consonance, is the least reliable. It seems surprising
that what appear to be immutable physical capacities are measured with
less reliability than some more subtly psychological factors; perhaps this
is due to the large number of fine discriminations which must be made,
and to vagaries of attention, rather than to the nature of the trait or
defects in the tests.
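
The Spearman-Brown correction referred to in this paragraph can be sketched
briefly. The snippet below is a minimal illustration and not a reconstruction
of Drake's computations: the odd-even correlation of .75 is a hypothetical
value, and the function name is chosen only for this example.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a test lengthened by `length_factor`
    from the correlation between comparable parts (e.g. odd and even items)."""
    return (length_factor * r_part) / (1.0 + (length_factor - 1.0) * r_part)


# Hypothetical correlation between odd-item and even-item scores,
# assumed purely for illustration.
r_odd_even = 0.75

# Stepping the half-test correlation up to full test length.
full_length_reliability = spearman_brown(r_odd_even)
print(round(full_length_reliability, 3))
```

The formula assumes that the two halves are equivalent and that errors on
one item are independent of errors on the next. In a paced test a lapse of
attention is likely to affect adjacent odd and even items alike, which is
one reason why the corrected coefficient may be spuriously high.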

Validity. Most studies of the validity of the Seashore tests have been
concerned, as one might expect, with the relationship between scores and
variables such as intelligence, music grades, and success as a musician. In
the revised manual and related publications (f>bi»), however, Seashore has
taken a new and different position. Although the validation studies have
tended to demonstrate a considerable degree of predictive and
occupational differentiating power, he now seems to feel that the validity of
the tests lies in their accurate measurement of basic capacities which
are utilized by musicians, rather than in the degree to which they are
correlated with success in musical training or performance. The
difference may seem a fine one, but it may be made clearer by explaining that
in the latter approach one correlates test scores with grades or ratings,
whereas in the former one analyzes the performance of musicians in
order to ascertain to what extent they reveal high degrees of pitch
discrimination, sense of timbre, etc. To the writer, this seems like a reversal
of the natural order of things, for surely one should analyze the job to
ascertain what factors seem to be important in it, then construct tests to
measure them, and then, as validation of both the job analysis and of
the tests, correlate scores on the tests with criteria of success on the job.
If there is no relationship between the measures and success, it matters
little what the analysis showed. Perhaps Seashore did not intend to
convey the impression that he had thus reversed his approach, or perhaps it
was simply that, having found objective methods of analyzing the
performances of musicians ((192), his interest in the technique caused him
to lose sight of its place in the prediction, as opposed to the analysis, of
musical performance. Be this as it may, there are a number of helpful
studies of the predictive value of the musical aptitude tests in their older
form; comparable studies of the essentially similar new form have yet to
be published, research of this type having been interrupted during
World War II and Seashore having been retired.

Intercorrelations of the original six tests were reviewed by Farnsworth
who found them to have a median intercorrelation of . j8 for
college students and .25 for elementary and junior high school pupils.
This suggests that the capacities measured by these tests are not as
completely independent and basic as Seashore believes them to be, a suggestion
apparently confirmed by Drake’s factor analysis (211) of the five best
Seashore tests, the Kwalwasser-Dykema tonal movement test, and two
new tests, one of memory and one of retentivity, which revealed one
common factor and three group factors underlying them. It may be, for
example, that senses of pitch and rhythm underlie tonal memory.

Intelligence has repeatedly been found to have little relationship to
Seashore scores. Farnsworth’s review (2 jfi) covered the earlier studies of
this topic, sixteen in all, with a median correlation of .10, the range being 
— .08 to .45. 

Grades in music courses have less often been used as a criterion of
success, perhaps because they have not seemed sufficiently representative of
musical ability. Larson’s finding of a correlation of .59 between composite
Seashore scores and grades in the first course in music theory at the
Eastman School of Music seems rather high; a correlation of .31 between
Seashore tests and grades in a college of music was reported by
Highsmith (370), which seems more in line with probability. Intelligence tests
were found more useful in this latter study (r = .42), and were included
in the Eastman School battery (7 j8).

Ratings of musical ability have not yielded such satisfactory results.
Mursell (559) reviewed such studies and drew the conclusion that the
tests were invalid. In view of studies such as Stanton’s (see below), which
have utilized objective procedures and have demonstrated considerable
validity in the tests, it hardly seems justifiable to make such drastic
judgments on the basis of data as subjective as ratings. Not only have ratings
generally been proved unreliable (810), but in studies such as those in
question the subjects rated were all sufficiently able in music to be active
students, a select group, thereby narrowing the range of both ratings and
scores and artificially attenuating the relationship. In such circumstances
the making of ratings is more difficult and the product therefore less
reliable than ever.
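
The attenuating effect of a narrowed range can be illustrated with a small
simulation. The sketch below is hypothetical throughout: the population
correlation of .50, the cutoff of one standard deviation above the mean, the
sample size, and the function names are all assumptions made for the example
and are not drawn from the studies reviewed.

```python
import random


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_range_restriction(n=20000, rho=0.5, cutoff=1.0, seed=0):
    """Return (r in the full group, r among those scoring at or above cutoff)."""
    rng = random.Random(seed)
    scores, ratings = [], []
    for _ in range(n):
        score = rng.gauss(0.0, 1.0)
        # The rating shares variance with the score to the degree set by rho.
        rating = rho * score + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        scores.append(score)
        ratings.append(rating)
    full_r = pearson_r(scores, ratings)
    # Keep only the "select group": those whose test score passes the cutoff.
    kept = [(s, r) for s, r in zip(scores, ratings) if s >= cutoff]
    selected_r = pearson_r([s for s, _ in kept], [r for _, r in kept])
    return full_r, selected_r


full_r, selected_r = simulate_range_restriction()
print(f"full group r = {full_r:.2f}; select group r = {selected_r:.2f}")
```

In the select group the correlation falls well below its value in the
unselected group, although the underlying relationship is unchanged. Adding
unreliable ratings as the criterion would lower the observed figure further.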

Completion of musical training seems a much more objective criterion
of success than rating, even when the effects of financial factors are
recognized. Stanton (718) made a ten-year study of the Seashore tests at the
Eastman School of Music in Rochester. More than 2000 entering students
were tested, and the test results were not used but simply filed until
criterion data were available four years later. An analysis was then made
of the relationship between test scores and the completion of training in
music. The results of Seashore tests were combined with intelligence
(Iowa Comprehension) test scores and teachers’ ratings to provide a
“cumulative key” or overall predictor. It was found that 60 percent of
those who were rated “safe” risks on this basis had graduated in the
normal amount of time, 42 percent of those who were classified as
reasonably good risks and 33 percent of the fair risks graduated, in contrast
with 23 percent of the poor and 17 percent of the very poor risks. The
case histories of the high-scoring drop-outs were studied, in order to
ascertain why the predictions based on test scores were not even better
than they were; in these cases financial need, family pressures, and other
non-aptitudinal factors seemed to be sufficient cause.

This study has been criticized by Mursell (560:233) because the
predictive value of the Seashore tests has generally been assumed to have been
demonstrated by it, whereas the often referred to evidence is actually
not based solely on the Seashore tests. As Mursell pointed out, the data
were not presented in a way which made possible a definite evaluation
of the predictive value of the Seashore tests, although this could easily
have been done. The value of the “cumulative key” may have been due
largely to the intelligence test or to the ratings of previous music
teachers. There is implicit in Stanton’s report, however, evidence to the effect
that such data were available (718:68), and while it is true that if they
were available they should have been reported, statements to the effect
that the “lowest musical talent students were very short-lived in the
school” should be taken into account. No correlational data have been
located, but in an earlier study (717) it was reported that in the four
years, 1923-26, the percentage of students making grades of A, B and C
on the music tests rose from 79 to 92, and teachers’ estimates of student
talent rose from 67 to 88 percent in the same categories. The indication
is that the higher level of talent revealed by the tests was confirmed by
teacher evaluation. While the reports are to be criticized for their lack
of details from which generalization would be possible, it seems that the
findings are not to be dismissed as completely as Mursell suggested they
should be.

Occupational differences were also studied by Stanton in a comparison
of the scores of professional and amateur musicians with those of
beginning students of music and non-musicians. The former were found to
be significantly higher than the latter, a result which, in view of other
findings already mentioned which showed that the test scores are not
affected by training or experience, demonstrates the ability of the tests
to differentiate the more talented from the less talented musicians.

Preferences for different types of music were ascertained by Fay and
Middleton (2^0), working with 5j college students. Twelve musical
selections were played to this group, and were rated by them for
preferences. They found that those who preferred classical music made higher
scores on the pitch and rhythm tests than did those who preferred light
classical music or swing, and also scored higher on the time test than did
the swing fans. If confirmed by other studies with larger groups and
more extensive sampling of musical tastes this would be an indication of
the role of musical aptitudes, for apparently the most “high-brow” music
does appeal more to those who are best endowed. It would be interesting
to know what the relationship is between score on the Seashore tests
and satisfaction with employment as a member of dance and symphony
orchestras, assuming that extraneous factors such as working hours, rates
of pay, and employment stability could be controlled.

Use of the Seashore Measures of Musical Talents in Counseling and
Selection. From the preceding discussion it is apparent that the
Seashore tests measure aptitudes which are relatively independent of
mental ability and of each other, and that these are physical capacities which
mature by about age 15 and are not affected by training or experience.
Although it is possible that one or two of them are in reality a
combination of some of the others, the conclusion concerning their physical
basis still holds. Seashore's recommendation that the scores be used
separately, and never combined, should be followed if musical capacities are
to be meaningfully studied.

The occupational significance of the Seashore tests is primarily musical,
although they have been found to have some value in selecting persons
for other jobs in which ability to make auditory discriminations seemed
important. It is doubtful whether they will ever have guidance values,
however, outside of the field of music. In it, it has been demonstrated that
those who make high scores are more likely to complete training and to
achieve professional status than are those who make low scores.

In schools and colleges these tests can be used to advantage to screen
out students who have musical talents which are often unsuspected or
undetected, thus making it possible for them to develop their abilities
for their own enjoyment and that of others, if not actually as a means of
earning a living. If the training and experience in music is found to hold
a challenge, and if the skill acquired by the student seems equal to his
promise, then it may be appropriate to consider vocational possibilities
in music. In schools of music the tests can well be used as a selection
device, with due recognition of the fact that what a student has done
with his musical ability by that time is at least as important a predictor
of success as the ability itself. Talents may be a sine qua non, but they
cannot be sufficient in and of themselves.

In guidance and employment centers the tests probably have value
only in cases in which the prospect of further training is to be considered.
Job seekers who are already trained can best be judged on the basis of
performance, that is, by means of auditions. Those with some training
but seeking more should also have auditions, in which the amount of
previous training is taken into account by experienced teachers of music;
but in such cases the talent tests should be of value in checking up on
the trainability of the candidate. It should probably be kept in mind,
in such instances, that there are hierarchies in music as in other fields,
and that some persons of lesser aptitudes may find ways in which to use
them whereas others with more aptitude may find doors closed. For
example, the potential night-club crooner may succeed with a modicum
of talents assisted by good looks and a smooth manner, whereas a more
gifted person who aspires to symphonic work may find himself outclassed
in that field.

Business and industry have so far apparently failed to find, or to
attempt to find, any uses for these tests. Perhaps certain types of machine
tenders, inspectors, and mechanics, who need to judge the operation or
defects of machinery by pitch or other auditory senses, could be selected
partly by these means. The hypothesis would first need to be validated,
and then experimentation might actually find that thresholds are low
enough so that selection on this basis is unnecessary. When accident rates
are relatively high in such jobs, however, it might well be worth
experimenting with some of these tests. A good automobile driver, for example,
drives partly by ear, and responds at once to any change in the pitch of
the customary noises of his machine, thereby forestalling some types of
mechanical failure.



CHAPTER XIV


CUSTOM-BUILT BATTERIES FOR 
SPECIFIC OCCUPATIONS 


THE realization of the fact that tests are likely to give better predictions
when designed and validated for a specific rather than for a general
purpose has, for many years, led psychologists concerned with the selection
of persons for professional training to devise batteries of tests for specific
occupations. Some of these have been designated as tests rather than
batteries, and they have in general been called tests of professional
aptitudes, hence names such as the Moss Medical Aptitude Test and the
Ferson-Stoddard Law Aptitude Examination. But they have actually
been batteries of tests even when combined in one booklet, and they have
generally, but not always, been designed for use in selecting professional
students rather than in counseling students or selecting employees.

This latter point is an important one, for many school counselors
lacking a sound foundation in psychological measurement expect, on 
hearing of the existence of an instrument such as the Medical Aptitude 
Test, that they will find it invaluable in counseling their students or 
clients. In general, those who press the matter are disappointed, for they 
often find that the desired test is used exclusively by the professional 
schools which developed it as a selection device, or that it is disappoint¬ 
ingly like certain other familiar tests and therefore difficult to accept 
as a test of “medical,” “nursing,” or “teaching” aptitude. 

Whether available for general use, like the Engineering and Physical
Science Aptitude Test, or restricted to use in professional schools, like the
Medical Aptitude Test, batteries of tests for specific occupations are
nothing more than combinations of existing types of tests of special
aptitudes, usually modified in order to give them some of the specific
predictive and face validity which is characteristic of the
miniature-situation test. Thus the Engineering and Physical Science Aptitude Test
is made up of parts of the Revised Iowa Physics Aptitude Test, the Moore
Test of Arithmetic Reasoning, the Bennett Test of Mechanical
Comprehension, and the Moore-Nell Examination for Admission to
Pennsylvania State College; no special attempt was made to give the test face
validity, presumably because mathematical and mechanical items have
enough inherent face validity for technical fields. The Coxe-Orleans
Prognosis Test of Teaching Aptitude, on the other hand, is made up of
especially developed items, such as vocabulary, information, and
judgment. But these items were selected or devised so as to have special
bearing on education: the vocabulary deals with subjects with which
people who are interested in teaching are presumed to be familiar,
sometimes verging on “pedaguese”; the information is of a type which
a would-be teacher might well be expected to possess; and the judgment
items deal with classroom situations, behavior problems, and other
matters in the handling of which a prospective teacher should
presumably have some ability. They certainly possess face validity, although
whether they reproduce the life-situation on a small scale is of necessity
an open question until experimentally demonstrated.

One other type of battery of tests for specific occupations has recently
been developed, one by the United States Employment Service’s Division
of Occupational Analysis and the other by the Psychological Corporation:
these are respectively known as the General Aptitude Test Battery and
the Differential Aptitude Tests, discussed in some detail in the next
chapter. The principle underlying this type of test battery is that, since
each mensurable aptitude is usable in a number of occupations, standard
instead of custom-built test batteries can be constructed and normed in
such a way as to yield scores for a number of specific occupations. This
is fundamentally the same concept as that underlying the Primary Mental
Abilities Tests, but the approach is different. Instead of beginning with
a series of tests designed to measure the currently known and isolable
aptitudinal factors and proceeding to ascertain their vocational
significance, as in Thurstone’s work, the procedure has been to develop tests
which are fundamentally the same as those which have been
demonstrated to have occupational significance, and then to obtain occupational
norms for this uniformly developed and standardized series of tests. Since
mechanical comprehension tests have proved valid for some occupations
but not for others, such a test is likely to be included in such a battery
and given a weight in the score for a given occupation which is
proportionate to its correlation with success in that occupation. Sometimes,
as in the case of the USES battery, the tests are parts of well-known tests
or close approximations of them; in other batteries, as in that of the
Psychological Corporation, they utilize somewhat more original types
of items designed to measure the same factors or constellations of factors
as existing tests; in neither case is any attempt made to measure pure
factors, as in Thurstone’s batteries. The USES does not use all of the
tests in its battery for each occupation, selecting, instead, the few which
have the most predictive value for any one occupation; the Psychological
Corporation, on the other hand, has planned its work around the battery
as a whole.

The multi-occupational approach of the last two test batteries
represents a new trend, different from that of the professional aptitude tests
discussed in this chapter. It results in one relatively brief series of tests
with many applications, rather than in a collection of diverse test
batteries, each usable only for one occupational field. It is potentially much
more valuable to vocational and educational counselors than is the
professional aptitude test, for, with one battery of tests, it becomes possible
to explore a great variety of occupational possibilities. It takes time to
accumulate occupational norms for such a battery of tests (the General
Aptitude Test Battery came into tentative practical use by the USES only
in 1947, after nearly a decade of work, and the Differential Aptitude Test
Battery, with the expenditure of $75,000, is just beginning to develop
occupational norms); it takes even more time to develop special batteries
for a number of occupations. But it is also true that special occupational
batteries are likely to have greater immediate validity for selecting
students or employees than general aptitude test batteries, because of
their miniature-situation elements and their custom-built character; these
advantages are soon lost by the changes which take place in specific
details, outmoding many miniature-type items, and by the variations
from one employing agency to another unless continuous research
maintains the tests. For example, the writer developed a personality inventory
for the selection of Air Force pilots during World War II (801), which
had more validity than the standard personality inventories and tests
which were tried out at the same time; it was truly custom-built, with
items phrased in the language of aviation cadets and content drawn
from their wartime experiences, both actual and anticipated. But changes
connected with the end of the war made this test currently useless as a
personnel instrument. The obvious conclusion is that tests with
custom-built items are best for selection programs in which conditions are
relatively stable and investments are great enough to warrant the continuous
validation of existing tests and the constant construction of new
instruments, but that for counseling purposes tests consisting of generalized
items with occupational norms are the only practical choice.

The tests discussed in this chapter are largely custom-built and were
designed for personnel selection. Those in the next are batteries of tests
containing generalized items lending themselves to custom-built norming,
designed primarily for counseling or for selection in programs unable
to support a continuous program of test construction.

As indicated above, tests of so-called professional aptitude have almost
invariably been developed for the selection of students in professional
schools. Professional training institutions invest so much in their students
as to make selection essential; in a few instances they have been developed
for the selection of other types of trainees or employees, but here also
the investment in the trainee or worker has generally been large, as in the
Air Force pilot-training program. The tests have generally been kept
confidential in order to prevent coaching, being made available only to
member schools or official testing centers. Tests of this type are briefly
described in this section, as the great majority of users of psychological
tests need no more than a knowledge of their existence and nature. A
few tests of this type are available for general use, and while these are
discussed at slightly greater length they are not treated in detail because
most of them have not been widely studied. Both types are taken up
under the title of the occupation for which they were developed, the
occupational titles being arranged in alphabetical order.

Business Executives. Although little has been published on the
subject in psychological journals, a great deal of time and money is currently
being spent on the application of psychological methods to the selection
of executive personnel. General discussions of the executive selection and
evaluation services offered by consulting organizations have been
published in the May-June, 194O, issue of the Journal of Consulting
Psychology, but evaluative studies are lacking on this very important phase
of personnel psychology. In general, there may be said to be five current
types of work in executive selection and evaluation: 1) the development
of custom-built batteries of tests such as the Cleeton-Mason Vocational
Aptitude Examination and the U.S. Civil Service Commission’s
experimental battery, discussed below; 2) the validation of standard tests for
this particular purpose, as in the University of Minnesota’s College of
Business Administration project also discussed below; 3) the development
of single tests for executive interests or other traits, best illustrated by
Strong’s work with executives and public administrators, mentioned
below, and discussed in connection with that inventory; 4) the clinical
use of interviews and tests as commonly done by consulting psychologists,
considered in this section; and 5) the use of clinically evaluated situation
tests as developed by the British War Officer Selection Boards and carried
further by the U.S. Office of Strategic Services for the selection of
personnel for critically important assignments, also considered in this section
despite the fact that it has so far not been written up as a procedure for
the selection or evaluation of executives in business and industry.

The Cleeton-Mason Vocational Aptitude Examination (McKnight
and McKnight, 1947) is designed to measure aptitude for four types of
business activity: clerical, accounting, administrative, and technical. It
is one of the few tests which purport to measure aptitude for executive
work; it consists of eight subtests, the contents of which measure general
information, arithmetic reasoning, analogic reasoning, reading
comprehension, interest (as in Strong’s), personality (as in Bernreuter’s),
vocabulary, and ability to estimate such things as the number of cars in the
United States. Although the authors have written a monograph on
executive ability, in which they have analyzed the nature of the
executive’s task in a helpful manner, data on the validity of the test are so
lacking as to make the test itself of little value in vocational counseling.
The purposes it might serve are probably better served at present by
batteries of tests, such as the Otis Tests of Mental Ability, the Minnesota
Clerical Test, and other tests of special aptitudes which have been rather
thoroughly studied, except perhaps when the test items are completely
tailor-made.

A battery for the selection of public administrators has been developed
by Bransford, Mandell, and Adkins of the U.S. Civil Service Commission
(117,505), utilizing two standard tests of intelligence (the A.C.E.
Psychological Examination and Thurstone’s Estimating Test) and
custom-built tests of current events, data interpretation, administrative
judgment, and knowledge of agency organization and personnel. The
criterion of success was a combined rating of administrative effectiveness,
the average number of raters per employee being four. The top
management ($6,200 to $10,000) group consisted of 20 persons; for this group,
the correlations between criterion and A.C.E. were .64, Current Events
.64, Interpretation of Data .65, and Administrative Judgment .68; other
validities for this group were low. For the staff group (63 specialists at
$2300 to $7500) the validities were .50, .26, .41, and .49. The multiple
validity coefficient for the staff group (the only one large enough for
computation) was .55. These data suggest that truly custom-built tests
of executive ability may have considerable validity, but this battery
cannot yet be considered to have been validated, in view of the fact that
the persons tested were already on the job at the time of testing. Its
validity can be considered established only after applicants for
employment have been tested and followed up. This is especially true of
custom-built batteries, some of the items of which may be more readily handled
after one has worked in the situation than before. But, as the test
authors concluded, this preliminary work with the battery suggests that
it may have merit and that further validation should be carried out.

The validation of a battery of standard tests for the selection of
students of business administration at the University of Minnesota was
written up by Douglass and Maaske (207). This battery was designed
solely for local selection purposes, but the investigation does provide
some suggestions as to what types of tests are likely to have predictive
value. The tests which showed the closest relationship to success in the
college of business administration measured knowledge of social terms
(Wesley College Test of Social Terms) and of business mathematics,
with correlations with first-year honor point ratios of .56 and .47,
respectively, and an R of .(kj. It need hardly be pointed out that success
in training may be much more dependent upon academic ability (the
verbal factor) than success on the job, and that the selection or upgrading
of executives might require a rather different battery of tests.
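
The relation between two single-test validities and the multiple R of the
pair can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the validities .56 and .47 are the
ones reported above, but the intercorrelation of the two tests is not
given in the source, so the values of r12 tried here are hypothetical.

```python
import math


def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2 : validity of each predictor against the criterion
    r12    : correlation between the two predictors (hypothetical here)
    """
    if abs(r12) >= 1:
        raise ValueError("r12 must lie strictly between -1 and 1")
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


# Validities from the text; the intercorrelations are assumed values only.
for r12 in (0.0, 0.3, 0.6):
    print(f"r12 = {r12:.1f}  ->  R = {multiple_r(0.56, 0.47, r12):.2f}")
```

The more the two tests overlap (the larger r12), the less the second test
adds to the first, which is why a battery's R is often only a little
higher than its best single validity.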

Strong’s attempts to develop scales of executive interests (770,779) have 
shown that executives are not a homogeneous occupational group, but 
actually an extremely heterogeneous one, drawn from a great variety of 
fields such as sales, accounting, engineering, clerical, and skilled occupa¬ 
tions. Under these circumstances it seems probable that the traits which 
executives have in common are fewer and more difficult to isolate than 
those which subdivide the group. It might, for example, be easier to 
distinguish insurance executives from insurance salesmen, engineering 
executives from engineering technicians, or office managers from office 
clerks, than to distinguish executives as a group from a group of
men-in-general which includes insurance salesmen, engineering technicians,
and clerks. Strong’s work has shown that, in the field of interests at least,
what the salesmen, technicians, and clerks have in common is what the
insurance executives and office managers have in common. The lines
are drawn vertically rather than horizontally, the executive salesmen
being the most able salesmen, the executive engineers being the most
able engineers, the executive office workers being the most able office 
clerks. In the field of aptitude, also, being an executive may be a matter 
of being superior in one’s field, rather than having notable characteristics 
which are common to all types of executives (abstract intelligence would 
be an exception to this statement, in that executives in a given field and 
in all fields could be expected to excel in such a general ability). 

A battery of standard tests administered to 15 superior and 10 average
executives of a firm of consulting management engineers by Thompson
(820) is of interest as one of the few published studies reporting positive
results. The tests used included the Wonderlic Personnel Test, Michigan
Vocabulary Profile Test, Cardall Test of Practical Judgment, Kuder
Preference Record, Adams-Lepley Personal Audit, Beckman Revision
of the Allport A-S Reaction Study, Guilford-Martin Personnel Inventory,
and Root I-E Test. The criterion consisted of performance records (not
described) and ratings by partners; how reliable these were is not stated.
Differences between the superior and average groups, significant at or
above the 7 percent level, were found with the Wonderlic, Michigan
Vocabulary (Government, Physical Science, Mathematics, and Sports
subtests), Kuder (Mechanical and Social Service), and Adams-Lepley
(Firmness and Stability) tests. Both groups were found also to be above
the 93rd percentile on the Kuder Persuasive scale. All of the reported
differences favored the superior executives, except that on the Kuder
Social Service scale: in this characteristic the average or less successful
executives were at the 79th, while the more successful executives were at
the 51st, percentile. These results portray the successful management
engineer executive as superior to less successful partners in mental ability,
technical and governmental vocabulary, sports vocabulary, mechanical
interests, firmness, and stability, and inferior in interest in social service.
As Thompson’s groups were very small these conclusions are highly
tentative; cross-validation might change the picture considerably. Further
studies of this type appear, however, to be worth making.

The clinical use of interviews and tests is perhaps the most common
method now used by consulting psychologists in the selection or evalua¬ 
tion of executive (and sales) personnel. Although it does not make use of a 
total score based on a test battery, this procedure is briefly described here 
because of its prevalence and because it constitutes one method of using 
tests. 

Flory and Janney (267) have listed five factors which experience has
led them to believe must and can be appraised in executive evaluation:
intelligence, both abstract and concrete; emotional control, defined as
ability to maintain steady output without emotional tension under
varying and trying circumstances; skill in human relations, or leadership
in face-to-face situations; insight into human behavior, both one’s
own and that of other persons; and ability to organize and direct the
activities of others. Some of these traits can be rather effectively measured,
intelligence, for example, by means of standard tests and perhaps by
certain Rorschach indices. But others, such as emotional control and
insight into human behavior, have not as yet lent themselves to effective
measurement. The judgment of such qualities is a much more complex
and unreliable procedure than the statement by Flory and Janney
implies.

The procedures used by the consultants in question consist of a detailed
personal history secured in an interview lasting from twenty minutes
to two hours, “suitable objective instruments” to probe areas of
adjustment, and a clinical interview for the checking of symptoms
revealed by the personal history and the tests. Fear, in part of another
article in the same symposium (l>), mentioned comparable methods used
by another organization, without going into details other than stating
that they can be used only by a highly trained psychologist.

This procedure is nothing more than that used by any well-trained and
balanced user of tests for selection purposes: it consists of selecting and
interpreting the results of tests believed likely to throw light on significant
aspects of the applicant’s qualifications, gathering important supplementary
data by other means, and synthesizing them into a meaningful
picture. But, in contrast with test procedures for many other types
of work, it is actually less than what is done in most personnel evaluation
programs. For in the best use of tests in personnel selection and evaluation
the tests have been previously subjected to experimental validation
for the work in question, and are used because there is an objectively
demonstrated relationship between the test score and success in that job,
whereas in the procedure under discussion few if any such relationships
have been established and the additional clinical work is an attempt to
make up by subjective procedures for what has not been done by objective
techniques. Flory and Janney’s “suitable objective instruments” for
probing personality may be objective in form, and suitable in the best
judgment of a competent vocational psychologist, but the existence of a
relationship between scores on such tests and success in executive work
had not, at the time of their writing, been demonstrated. The personnel
selection and evaluation procedure described by Flory and Janney, Fear,
and others is a clinical procedure which uses tests diagnostically but not
prognostically; the predictions are based on clinical judgments and not
on the known relationships of tests.

To underline this fact is not to deny the value of current psychological 
methods of executive selection: as a matter of fact, they are probably 
superior to other presently available methods. It is merely to point out 
a major difference between the use made of tests in such programs and 
in most other selection or evaluation procedures. The reasons for this 
difference are clear: they lie in the elusiveness of personality factors, in
the primitive state of development which characterizes our present meth¬ 
ods of appraising personality characteristics, and in the fact that execu¬ 
tive selection is so vitally important that it justifies the time of the 
vocational and clinical psychologists who must make the clinical judg¬ 
ments involved. Subjective and even defective though these judgments
may be, they represent the best available: informed guesses are preferable
to uninformed guesses, and better-informed to less-well-informed. In the 
equally complex problem of predicting success in pilot training, for 
example, it was found that judgments made in psychiatric interviews 
with aviation cadets had a correlation of only .27 with success in flying 
training, as contrasted with a validity of .Gf> for a custom-built and
objectively scored test battery; the psychiatric interviews were of little more
than chance value, and much less effective than the test battery, but if
no such battery of valid tests had been available the weeding out of even 
a few failures would have justified depending upon the clinical judgment 
of the psychiatrists. The suggestion emerging from this discussion is that 
it would be well worth the while of organizations interested in the selec¬ 
tion and upgrading of executives to finance whatever fundamental 
research is a prerequisite to the development of better tests for the 
measurement of characteristics which may affect success in administrative 
and top-managerial work. 

Clinically evaluated situation tests used by the Office of Strategic 
Services have been described by Murray and MacKinnon (558) and by 
the Assessment Staff (33). In this work they were concerned with apprais¬ 
ing “the relative usefulness of men and women who fell, for the most 
part, in the middle and upper ranges of the distribution curve of general 
effectiveness or of one or another special ability,” and with assessing a 
number of “personality qualifications—social relations, leadership, dis- 
cretion . . .” As it seemed that none of the conventional screening
devices tested good will, tact, teamwork, freedom from annoying traits,
leadership, and other social qualifications, special procedures had to be
devised. In other words, the project had to develop methods of appraising
executive ability, since none were available; that the executive ability
was to be applied in “cloak-and-dagger” work is incidental, and should
not blind civilian personnel workers to the possibilities of the methods 
tried. That they are also being used in the British civil service is further 
testimony to their general promise. 

The O.S.S. procedure consisted essentially of bringing about 18 can¬ 
didates to a house party for a period of three and one-half days. The 
activities of the house party were directed by a staff of psychologists, 
psychiatrists, and sociologists. Data were gathered by means of casual 
observations; standard tests of intelligence, mechanical comprehension, 
etc.; projective tests such as Incomplete Sentences, Thematic Appercep¬ 
tion, and the Rorschach used primarily to assess motivation and emo¬ 
tional stability; personal history interviews of an hour and one-half;
group situation tests, one requiring working with a team to accomplish
a feat of physical prowess, another a discussion, in both of which leadership
might develop, and some assigned leadership problems in which the
examinee must lead his group; individual situational tests involving
frustration-tolerance and a stress-interview; an obstacle course; tests of
observing and reporting details; tests of propaganda skills as shown in
the preparation of a pamphlet to disturb Japanese workers in Manchuria;
psychodrama involving difficult social situations; debate in a
convivial party; a sociometric questionnaire concerning fellow candidates;
and judgment of others as revealed in sketches of the five men
known best during the three and one-half days.

Data obtained by these methods were clinically evaluated by the staff
subgroup responsible for the study of several candidates, and reworked
in case conference by the whole staff. About 20 percent of the 5,500 men
and women thus studied were not recommended for duty; 1,200 of those
who went overseas were followed up and evaluated by supervisors and
three or four associates. The choice and collection of criterion data was
not undertaken until late in the war, and convincing quantitative validation
proved especially difficult (33; Ch. 9). Despite these difficulties a
validity coefficient of .39 was obtained for a sample of 31 candidates
assigned to appropriate duties. The authors conclude, with some justification,
that the true validity of their procedure was probably between .jy
and .60 (33:424). Two important points can now be made on the basis of
this work:

The first is that the possibility of following up employees and obtaining
evaluations under even the most difficult circumstances is rather
conclusively demonstrated by the obtaining of evaluations of men and 
women who were appraised in this country and followed up in scat¬ 
tered combat areas; 

The second is that there are many devices for obtaining potentially 
significant and quantifiable personality data which psychologists have 
only begun to explore, making the field of personality measurement a
rich one in which to carry on research. Since executives play crucial roles
in their organizations, and represent considerable investment of company
or public funds, the exploration of these possibilities should be well 
worth the while of business, industry, and government. 
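
How much weight a validity coefficient of .39 based on 31 cases can bear
may be judged from its sampling error. The following is a minimal sketch,
not the O.S.S. staff's own procedure: it applies the standard Fisher z
transformation to the r and N reported above, and assumes a simple random
sample and a 95 percent confidence level (z_crit = 1.96).

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z."""
    if n <= 3:
        raise ValueError("n must exceed 3")
    z = math.atanh(r)                 # transform r to z
    se = 1.0 / math.sqrt(n - 3)       # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


low, high = fisher_ci(0.39, 31)
print(f"r = .39, N = 31: interval from {low:.2f} to {high:.2f}")
```

The interval is very wide, running from just above zero to the middle
.60's, which illustrates why coefficients from so small a sample call for
cross-validation. It reflects sampling error only, and says nothing about
the unreliability of the criterion ratings.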

Dentists. Research in the selection of students for dental schools was
summarized in 1940 by Bellows (65). Most of the batteries used consisted
of standard tests selected because it was thought they would have validity
for this purpose, but two included tests which were developed specifically
for dental selection. One was the Iowa Dental Qualifying Examination
of the State University of Iowa (72.)), the other a battery developed at
the University of Minnesota partly on the basis of the Iowa work (i»oS).
The Iowa tests were: information on the development of the teeth,
reading comprehension (dental anatomy), memory for nomenclature,
predental chemical information, predental zoological information, a
worksample (trimming a plaster of Paris block to specification), and a
paper-and-pencil test of spatial relations. The correlations between scores
on the first five tests and theory grades in thirteen dental schools ranged
from .11 to .74, the average being .53; for the worksample the correlation
with grades in first-year technique courses was .fin; that for the spatial
test was ..j 1. Several possible combinations of tests were used in the
Minnesota studies (noH), their validity varying somewhat not only from
battery to battery but also from year to year, the numbers varying from
83 to 111. One battery consisted of predental grades (r = . jr,), a
metal-filing worksample (.53), the Iowa Visual Memory Test for Nomenclature
(.40), the O’Connor Finger Dexterity Test (-.30), and the Iowa Spatial
Relations Test (.52); the multiple correlation with total grades in dental
school was .78, even when only the filing, memory, and dexterity tests
were used. When laboratory (Prosthesis) grades were used as a criterion,
the Metal Filing Test (custom-built) had a validity of .fio while that of
the Finger and Tweezer Dexterity Tests (standard) was -.35 and -.43
(high time scores are bad, hence the negative relationship).

These studies show that grades in dental school have been predicted
with considerable success by means of batteries of tests, some of which
were constructed especially for that objective. However, the value of this
approach can be judged only by comparing its results with those of
studies which have used standard tests weighted for dental selection on
the basis of local validities. Only Harris’ study (341) permits such a
comparison, made in a different school with a criterion (grades) which may
have been more or less reliable than those in the Iowa and Minnesota
studies: his multiple validity coefficient, using predental grades and
intelligence test as predictors, was .by, which is substantially lower than
the .78 obtained at Minnesota with a special battery. Whether or not
the additional validity justifies the extra labor of constructing the special
battery depends, of course, upon the expense of the mistakes which result
from using an inferior selection procedure.

Engineers. Although various investigators and institutions have
developed procedures for the selection of engineering students, and the
Engineers Council for Professional Development is now working on a
large-scale study of this type, no so-called tests of engineering aptitudes
were published until the appearance on the market of the Engineering
and Physical Science Aptitude Test (Psychological Corporation,
1943). Oddly enough, this is not, at least at present, a test for selecting
students for colleges of engineering. It was developed in connection with
the war-industry training program at the Pennsylvania State College,
and so has norms for miscellaneous young men and women, some of
them not high school graduates, who applied for technical training at
the trade and technician level in connection with war industries. This
test, or rather battery of tests, is not a test with custom-built items in the
sense in which that term is used here. Instead, it consists of items from
existing tests of special aptitudes, selected on the basis of item validities
to constitute a new battery. The items were therefore custom-selected,
but not custom-built; they are of possible general significance, rather
than drawn from and restricted to the local situation. It is only the
weights and norms which are custom-built.

The tests from which the items were selected on the basis of local
validities were the Iowa Physics Aptitude Test (revised), which provided
the Mathematics, Formulation, and Physical Science Comprehension
Tests; the Moore Test of Arithmetic Reasoning, which supplied the
Arithmetic Reasoning Test; the Bennett Test of Mechanical Comprehension,
from which came the Mechanical Comprehension Test; and the
Moore-Nell Examination for Admission to Pennsylvania State College,
vocabulary section, which provided the Verbal Comprehension Test.
These are all tests which have been found to have some value in predicting
success in technical and engineering courses, but until engineering
norms have been provided for this form of the test it must be considered
more likely to be dangerous than helpful in selection or counseling. A
high school senior might, for example, compare very favorably to the
norm group of miscellaneous young men and women, some of whom
did not have the academic ability to finish high school, but find it difficult
to compete with typical college freshmen (confirmed by a study by
Fagin at Brooklyn Polytechnic Institute). On the other hand, since the
items have all been twice selected on the basis of validity for predicting
success in technical training at some level (in the original test and in this
battery) the battery should be a very promising one on which to collect
local data and to establish local norms. An engineering or technical
school which cannot at once invest much money in test construction and
validation would probably find that this battery provided a ready basis
for establishing local selection criteria.

As currently available information concerning the Engineering and
Physical Science Aptitude Test is limited to the original study and is
contained in the manual and in the article by Griffin and Borow (31.j),
work with it is not discussed here in any detail. It should suffice to say
that correlations between scores on this test and grades in technical
courses ranged from .13 to .71, depending upon the course and the subtest,
and that the correlation between total score and average grade was
.73. Subtests showed higher correlations with grades in the types of
courses with which one would expect them to be related than in others:
the correlation of .71, for example, was between the Mathematics score
and grades in mathematics, whereas a correlation of . 1\ was found for
Mathematics score and grades in a course in manufacturing processes.

Attempts to develop batteries of tests for selecting engineering students,
in which standard tests have been used as tests rather than as sources of
items, are perhaps best illustrated by studies conducted by Holcomb
and Laslett (375), Laycock and Hutcheon (.J5O), and Brush (122).
Holcomb and Laslett used the MacQuarrie Mechanical, Stenquist Picture,
and Stenquist Assembly Tests, and the Strong Vocational Interest Blank
(engineer scale). They computed no multiple correlation coefficients,
although they found validities of .48, .15, .43, .16, and .32 respectively.

Laycock and Hutcheon used the National Institute for Industrial
Psychology (England) Form Relations Test, the Cox Mechanical Aptitude
Tests (Models and Diagrams), and the physical science score on the 
Thurstone Interest Inventory, together with high school grades and 
scores on the A.C.E. Psychological Examination. The best combination 
of this group consisted of marks in grade 12, A.C.E. score, Form Rela¬ 
tions, and Physical Science Interest, the multiple r being .66. 

Brush’s study included the Minnesota tests, the Wiggly Block, the Cox 
Mechanical Aptitude Tests (Explanation, Completion, Models), the 
MacQuarrie, the Thorndike Intelligence Examination, and the Columbia 
Research Bureau science tests; not all students took all tests, as he worked 
with two groups. The multiple correlations for all tests and four-year 
engineering grades were .51 for one group (no intelligence test included) 
and .61 for the other (including intelligence test). With the first group
the best battery was probably that consisting of the Minnesota Paper
Form Board and the Cox Models, with an R of .}6. For the second group
the best batteries were one consisting of the Thorndike, C.R.B. Algebra
and Geometry, Cox Models and Completion, Minnesota Paper Form
Board and Interest Analysis, with an R of .59; another consisting of the
C.R.B. Physics, Chemistry, Geometry, and Algebra Tests, for which the
R was .585; and a third made up of the Thorndike, C.R.B. Algebra, Cox
Models, and Minnesota Interest Analysis, with an R of .59. The highest
correlations for single tests were for C.R.B. Algebra and Physics, and
Thorndike Intelligence, respectively .51, .50 and .43.

It will be interesting and valuable to see, at some future date, the
relative validity of a battery such as the EPSA Test, in which items have
been custom-selected, when compared with batteries of standard tests
such as these.

Lawyers. Tests and test batteries for the selection of law students
have been developed at a number of universities, notably California,
Columbia, Iowa, Michigan, Minnesota, and Yale, and most recently by
the Educational Testing Service (28:8); a recent review of work with
these and other tests in law schools has been prepared by Adams (4). The
pioneer test in this field appears to have been the Ferson-Stoddard Law
Aptitude Examination (West Publishing Co., St. Paul, Minnesota, 1927).
It consists of four parts: a reading comprehension and recall (after the
other parts) test based on a law case, a reading comprehension and
reasoning test based on another case, a verbal reasoning test, and a reading
comprehension test based on legal material. The test has been used only
by law schools and has not been available to counselors. Ferson and
Stoddard (254) found that the Law Aptitude Examination had a correlation
of .54 with the first-year law grades of 100 students at the University
of Iowa; as summarized by Adams (4), subsequent studies of the test
yielded validity coefficients of .54 at Tennessee, .31 at Newark, .42 at the
New Jersey Law School, .46 at Illinois (first semester only), and .49 at
Chicago. All of these studies agree, then, in showing considerable validity
in the test, about that shown by scholastic aptitude tests in liberal arts
colleges; since the populations of law schools are somewhat more
homogeneous than those of colleges, the test presumably has somewhat more
validity. It is therefore interesting to compare its validity, in the last
study, with that of the A.C.E. Psychological Examination, which was
found to be .5(1 in contrast with that of .49 for the Ferson-Stoddard Law
Aptitude Examination. As the combined tests yielded a correlation of
.62 it seems that, although they were not measuring exactly the same
thing, the contribution of the professional aptitude test was not great.
In the Illinois study (91 (>) Welker and Harrell found that pre-law grades
had much more predictive value than the aptitude test (.7;, as compared
to .46), and that the correlation for the combined indices was not much
higher (.78). Only three of the six scores of the Ferson-Stoddard test
(Part 2 of which yields three scores) were found to have any appreciable
correlation with grades: these were Part 2C, Relevant Facts, Part 3,
Logical Inferences, and Part 4, Matching; these validities were .17, .28,
and .31, respectively; the validities of A.C.E. part scores were of the same
order, but more consistently so. The implication is that a good general
intelligence test is at least as useful as this professional aptitude test,
especially when one notes, with Welker and Harrell, that the effective
law aptitude subtests are the reasoning rather than the “legal memory”
tests. Studies at the University of Minnesota (2oli) obtained correlations
of custom-built tests with law grades which were as good as those for
intelligence tests, but multiple correlations permitting comparison were
not reported.
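
The remark above that a test probably has more validity than its
coefficient shows when the group tested is homogeneous can be made
concrete with the usual correction for restriction of range on the
predictor. The sketch below is an illustration only, not a computation
from any of the studies cited: the validity .49 is the Chicago figure
quoted above, while the ratio u of the applicant-group standard deviation
to the law-school standard deviation is a hypothetical value.

```python
import math


def correct_for_range_restriction(r, u):
    """Estimate validity in the unrestricted group.

    r : validity observed in the restricted (selected) group
    u : SD of test scores in the unrestricted group divided by the SD in
        the restricted group (hypothetical in this example)
    """
    if u <= 0:
        raise ValueError("u must be positive")
    return (r * u) / math.sqrt(1 - r ** 2 + (r ** 2) * (u ** 2))


# Observed validity from the text; the SD ratios are assumed values only.
for u in (1.0, 1.25, 1.5):
    estimate = correct_for_range_restriction(0.49, u)
    print(f"u = {u:.2f}  ->  estimated validity {estimate:.2f}")
```

With u equal to 1 (no restriction) the coefficient is unchanged; the more
the selected group has been narrowed, the larger the estimated validity in
the full range of applicants.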

In 1943 Adams (4) published studies of a new Iowa Legal Aptitude
Test, developed for use in the same institution as the Ferson-Stoddard
nearly twenty years earlier. Its preliminary form consisted of eight
subtests, the first three of which are not legal in content, while the last five
are. Part 1 is a verbal analogies test, Part 2 is a mixed relations or more
complex analogies test, Part 3 contains opposite items of a verbal type,
Part 4 is a test of memory for material in a judicial opinion read before
Part 1, or two hours earlier, Part 5 is a reading comprehension test
stressing judgments of relevance, Part 6 is also a reading comprehension test
adapted from Part 2 of the Ferson-Stoddard Test, Part 7 is a verbal
reasoning test, and Part 8 is a legal information test. When the first-semester
grades of 110 law students were correlated with part and total scores
on the Legal Aptitude Test, the former were found to range from .36
(Part 5, reading comprehension for relevance) to .57 (Parts 3 and 7,
verbal opposites and verbal reasoning), while the validity of the total
score was .hr,. It was decided to use Parts 3, 7, and 8 (verbal opposites,
verbal reasoning, and legal information) in the final form of the test;
the multiple correlation of these subtests with the criterion was .67,
higher than that of the total score on the preliminary form of the test.
Although no comparisons were made with general intelligence tests,
comparison with the predictive value of achievement tests and pre-law
grades indicated that in this case the professional aptitude test had more
validity than the non-specialized indices. This was presumably because
the professional aptitude test was, itself, a highly refined test of general
intelligence, couched in terms most appropriate to the field in question,
to which was added an interest-achievement factor by the inclusion of a
subtest of legal information.

Nurses. Batteries for the selection of nursing students have been
developed by a number of university schools of nursing, and by independent
organizations or individuals working on a consulting basis with
nursing schools.

The George Washington University Series of Nursing Tests (Center
for Psychological Service, George Washington University, 19 ] t) was
developed from the Moss-Hunt Nursing Aptitude Test, first published in
1931 and available to counselors. The series incorporates a modified form
of the Nursing Aptitude Test, consisting of five parts, as follows: judgment
in nursing situations, memory for anatomical diagram and nomenclature
studied during the test, nursing information, scientific vocabulary,
and following directions in filling out a nurse’s report form. This test is,
obviously, custom-built as to items, drawing heavily on the technical
content of nursing as it might have been experienced before training or
as presented in the test itself. A second test in the series is a Reading
Comprehension Test, utilizing material from commonly used textbooks
in nursing schools. The third test is an Arithmetic Test; the fourth is a
General Science Test based on high school courses; and fifth is an
Interest-Preference Test somewhat resembling Strong’s Vocational
Interest Blank, the items of which were selected because they differentiated
nurses from non-nurses. Norms based on high school graduates applying 
for admission to nursing schools are provided with the manual, together 
with suggested critical scores for interpretation, but it is recommended 
that local norms be developed because of differences in standards. No 
indication of the numbers on which the standardization was done is 
included, nor are there any data on the validity or reliability of the new 
series of tests. No references to them have been located in the literature. 
Although the test items look promising and have obviously been based 
on the best available experience in nurse selection programs, validation 
data such as have been provided by other investigators using the earlier 
form of the Aptitude Test for Nursing are needed. In these studies of the 
earlier form of the first subtest of the present series Douglass and Merrill 
(209) found correlations ranging from .54 to .62 with grades in the first 
year of nursing school at the University of Minnesota, and Williamson 
and others (929) found correlations of .34 and .37 with grades in twenty 
schools of nursing. As these grades were very unreliable in some schools 
the validity seems lower than it actually was; in one school with a better 
marking system the validity was .49. It seems clear that this one part of 
the present series is what the manual suggests: a “specialized intelligence 
test for prospective nurses.” The other subtests appear to be specialized 
achievement and interest measures for prospective nurses, but need to be 
evaluated as such. 
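
The point that unreliable grades make a test's validity seem lower than it
actually is can be illustrated with the classical correction for
attenuation applied to the criterion alone. This is a sketch under stated
assumptions rather than an analysis of the studies cited: the validities
.34 and .37 are those reported above, but the reliability of the grades is
not given in the source, so the value used for it here is hypothetical.

```python
import math


def correct_for_criterion_unreliability(r_xy, r_yy):
    """Validity corrected for unreliability in the criterion only.

    r_xy : observed correlation between test and criterion (grades)
    r_yy : reliability of the criterion (hypothetical in this example)
    """
    if not 0 < r_yy <= 1:
        raise ValueError("criterion reliability must be in (0, 1]")
    return r_xy / math.sqrt(r_yy)


# Observed validities from the text; a grade reliability of .50 is assumed.
for observed in (0.34, 0.37):
    corrected = correct_for_criterion_unreliability(observed, 0.50)
    print(f"observed {observed:.2f}  ->  corrected {corrected:.2f}")
```

The correction estimates what the validity would be against perfectly
reliable grades; it does not make the test itself any better, and it is
only as trustworthy as the reliability figure put into it.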

The Nursing Entrance Examination Program of the Psychological
Corporation has developed another battery of tests for use in schools of 
nursing. This battery is administered periodically at various centers 
throughout the country, by arrangement with co-operating institutions; 
it is not available for general use. Unlike the battery developed by Hunt, 
it consists of standard tests found useful in selecting nursing students 
rather than of custom-built tests; it is only the norms that are custom- 
developed. The program has been described by Potts (611). 

Other standard tests have been used in studies referred to earlier (209, 
929), conducted at the University of Minnesota and co-operating schools 
of nursing. In these it was found that standard tests of vocabulary (Co¬ 
operative Test Service), English, and General Science had substantial 
validities, as high as .44, .53, and .58 in one school where marking was 
reasonably reliable. Douglass and Merrill found a validity of .77 for the 
Moss-Hunt Test of Nursing Aptitude and the Co-operative General 
Science Test. Crider (181) found that the Strong Interest and Bell 
Adjustment inventories added little to predictions based on the Otis Test 
of Mental Ability, confirming Douglass and Merrill’s correlation of .20 
for Strong’s nurse scale and grades. 

Pharmacists. Until recently little attention was paid to the scientific 
selection of students of pharmacy, and little was known concerning 
psychological factors related to success in this occupation. During World 
War II, however, members of the occupation became more self-conscious 
as a profession, even to the point of changing the status of pharmacists in 
the Army from enlisted to commissioned grade. Since the War the 
American Pharmaceutical Association has been engaged in a co-operative 
study with the American Council on Education, one of the purposes of 
which is to develop better methods of selecting pharmacists, and Schwebel 
(unpublished study) has developed a pharmacist scale for Strong’s 
Vocational Interest Blank. 

Physicians. The Moss Medical Aptitude Test (Association of American 
Medical Colleges, 1930) was for many years the standard instrument 
for the selection of medical students, used by most medical schools 
in the United States and not available to others. New forms were 
provided periodically, but the content is rather like that of the Moss-Hunt 
Nursing Aptitude Test which has already been described and which was 
based in part on Moss's experience with the Medical Aptitude Test. 
Parts deal with comprehension and retention, logical reasoning, scientific 
vocabulary, etc., making the test one of intelligence measured by means 
of medical material. Some of the studies published in the Association’s 
journal have shown that there is a tendency for high-scoring applicants 
to succeed in training and to be rated favorably as interns, whereas those 
who make low scores tend to do poorly. Moss (550) reported that one 
percent of the top-decile students failed, as contrasted with 1S percent 
of the bottom-decile students. Chesney (155) found that refusing to admit 
anyone in the lowest decile would eliminate 25 percent of the failing 
students, 15 percent of the mediocre students, 7 percent of the fair 
students, and only 3 percent of the good students. But Douglass (203) and 
Cavett and others (152) found validities of only .12 to .34 for various 
classes at the University of Minnesota, compared to .40 to .57 for liberal 
arts grades. Moon (535) found a closer relationship at Illinois, where the 
validity was .42 and liberal arts grades had a validity of .49. The 
Minnesota Medical Aptitude Test, another custom-built battery, had 
validities of only .14 to .40. Strong’s Physician scale had a validity of only 
.if) for 131 students, using first-year honor points as criterion. Stuit (78.}) 
obtained correlations of .23 and .32 between the Moss Test and first-year 
grades in medicine at the University of Iowa, as compared to correlations 
of . and ..jf> between college grades in liberal arts and science courses, 
on the one hand, and medical grades on the other. The Moss Test and 
science grades yielded a multiple correlation of only .39. These studies 
suggest that, although the Moss and Minnesota Medical Aptitude Tests 
have some value in selecting medical students, they do not add much to 
predictions made on the basis of undergraduate college grades. 
Apparently further study and development of new types of instruments are 
needed in this field. In the meantime, the standard measures of 
intelligence and achievement in appropriate areas will probably prove as 
useful as the professional aptitude test in appraising promise in this field. 
The Educational Testing Service (28:9) now handles this 
admission-testing program for the A.A.M.C. 
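
The practical meaning of Chesney's figures for a lowest-decile cut-off can be worked through with a little arithmetic. The sketch below is illustrative only: the elimination percentages are those quoted above, but the size and composition of the entering class are hypothetical, chosen so that the students eliminated amount to one tenth of the class, and the function name is an assumption of this example.

```python
# Share of each group falling in the lowest decile, as reported by Chesney (155).
ELIMINATED_SHARE = {"failing": 0.25, "mediocre": 0.15, "fair": 0.07, "good": 0.03}

# Hypothetical entering class of 1,000 students; this composition is assumed.
CLASS_COMPOSITION = {"failing": 100, "mediocre": 300, "fair": 300, "good": 300}


def apply_lowest_decile_cutoff(composition, eliminated_share):
    """Return the expected number retained in each group after the cut-off."""
    return {
        group: count * (1.0 - eliminated_share[group])
        for group, count in composition.items()
    }


retained = apply_lowest_decile_cutoff(CLASS_COMPOSITION, ELIMINATED_SHARE)
rate_before = CLASS_COMPOSITION["failing"] / sum(CLASS_COMPOSITION.values())
rate_after = retained["failing"] / sum(retained.values())

print(f"failure rate without the cut-off: {rate_before:.1%}")
print(f"failure rate with the cut-off:    {rate_after:.1%}")
```

Under these assumed proportions the cut-off removes a quarter of the eventual failures, yet the failure rate among those admitted falls only modestly, which is consistent with the conclusion that the test has some, but limited, selective value.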

Pilots. Apart from embryonic efforts in the first World War, tests 
for the selection of aircraft pilots were first developed early in World 
War II by the Civilian Pilot Training Program of the Civil Aeronautics 
Administration, the work of which was summarized by Viteles (90 ; ); 
were further developed for the U.S. Navy under Jenkins’ leadership 
(399); and especially by the Army Air Forces Aviation Psychology 
Program (21J,2(13) under Flanagan. The most far-reaching of these, both in 
the variety of tests used and in the extent of its validation procedures, 
was the last named; as it included tests comparable to those originated 
by the other two programs, only it is described here. 

The Aviation Cadet Classification Battery (U.S. Air Force, 1942, 
revised in 1943 and subsequently) consisted of a personal history 
questionnaire arranged in multiple-choice form and stressing experiences and 
background factors which had been found related to success in flying 
training, two spatial orientation (perceptual) tests utilizing aerial 
photographs and maps, a reading comprehension test, a dial and table reading 
test involving taking readings from airplane instruments and 
aeronautical tables, two instrument comprehension tests also based on flight 
instruments, a mechanical principles test based on the Bennett, a general 
information test presumably tapping interests and personality traits 
underlying the possession of information found to be related to success 
or failure in flying training, two mathematics tests, a rotary pursuit 
(eye-hand co-ordination) test, a lathe-type two-hand co-ordination test, 
a stick-and-rudder test in which controls are moved to match light signals 
appearing in a prearranged pattern, a rudder control test in which the 
examinee’s seat is kept in equilibrium by movements of the rudder with 
the feet, a discrimination-reaction-time test requiring the selection of a 
switch to be moved in order to put out a series of lights, and a pegboard 
measure of finger dexterity (21.1). Most of these, it may be noted, 
involved custom-built items: the biographical data items were written to 
tap aspects of experience which might be related to flying success, the 
perceptual items involved perception of the type used in pilotage, the 
eye-hand-foot co-ordination test used a stick and rudder, etc. Although 
correlational analysis techniques were used to insure relative 
independence of the tests, the miniature-situation element was strong in most 
of them. 

As is necessary in a custom-built selection testing program in which 
conditions are constantly changing, these tests, their antecedents, and 
their successors, were continuously validated as data concerning new 
criterion groups were received. The most impressive of these validation 
studies (im j: Ch. r,; :>f>j) was made with a group of 11.jy, candidates for 
aviation cadet training who were sent to pilot training regardless of their 
scores on psychological tests. Analyses were made to reveal the 
comparative validity of the psychological tests, the cadet selection battery as a 
whole, the Adaptability Rating for Military Aeronautics (psychiatric 
examination), the Army General Classification Test, the Aviation Cadet 
Qualifying Examination (custom-built intelligence test used in 
preliminary screening), and years of education. Data are reproduced in 
Table 26 (p. 350). 

The correlations given are with success in training through advanced 
flying school, that is, with ability to win wings and a commission. 
Outstanding in the above data are the following facts: 

The three most valuable tests are paper-and-pencil tests; 

The most valid tests are custom-built even in item content; 

The battery has more predictive value than the best single test; 

Objective tests have more predictive value than psychiatric judgment. 
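
Several of these observations can be checked directly against the coefficients of Table 26. The sketch below is illustrative only: the validities are those of the table, but the labeling of each test as "paper" or "apparatus" is this example's reading of the battery description given earlier, not a classification supplied by the table itself.

```python
# Validity coefficients from Table 26 (after DuBois). The second element of
# each pair is an assumed classification based on the battery description.
TESTS = {
    "General Information": (0.51, "paper"),
    "Pilot Instrument Comprehension II": (0.48, "paper"),
    "Mechanical Principles": (0.43, "paper"),
    "Complex Co-ordination (Stick-Rudder)": (0.42, "apparatus"),
    "Discrimination-Reaction-Time": (0.42, "apparatus"),
    "Spatial Orientation II": (0.40, "paper"),
    "Dial and Table Reading": (0.40, "paper"),
    "Rudder Control": (0.40, "apparatus"),
    "Two-Hand Co-ordination": (0.36, "apparatus"),
    "Biographical Data": (0.33, "paper"),
}
BATTERY_STANINE = 0.66
PSYCHIATRIC_RATING = 0.27

# Rank the single tests from most to least valid.
ranked = sorted(TESTS.items(), key=lambda item: item[1][0], reverse=True)
top_three = ranked[:3]
best_single = ranked[0][1][0]

print("Three most valid tests:", [name for name, _ in top_three])
print("All three are paper-and-pencil:",
      all(kind == "paper" for _, (_, kind) in top_three))
print("Battery exceeds best single test:", BATTERY_STANINE > best_single)
print("Every test exceeds psychiatric rating:",
      all(validity > PSYCHIATRIC_RATING for validity, _ in TESTS.values()))

# Squared correlations give the proportion of criterion variance accounted for.
print(f"Variance accounted for: battery {BATTERY_STANINE ** 2:.2f}, "
      f"best single test {best_single ** 2:.2f}")
```

Squaring the coefficients is one conventional way of expressing the advantage of the battery over the best single test; other indices of predictive efficiency would give a somewhat different picture.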

Table 26 

RELATIVE PREDICTIVE VALUE OF CERTAIN CUSTOM-BUILT AND 
STANDARD PSYCHOLOGICAL TESTS AND CERTAIN OTHER INDICES 
FOR SUCCESS IN PILOT TRAINING (After DuBois) 

Test                                          Validity 

General Information                              .51 
Pilot Instrument Comprehension II                .48 
Mechanical Principles                            .43 
Complex Co-ordination (Stick-Rudder)             .42 
Discrimination-Reaction-Time                     .42 
Spatial Orientation II                           .40 
Dial and Table Reading                           .40 
Rudder Control                                   .40 
Two-Hand Co-ordination                           .36 
Biographical Data                                .33 

Stanine (Battery score)                          .66 

Aviation Cadet Qualifying                        .30 
Army General Classification                      .31 
Education                                        .21 
Flying Adaptability Rating (Psychiatric)         .27 

Later work with this battery has involved the factor analysis of these 
and certain other tests (316,317), the refinement of the most promising, 
the addition of subsequently developed tests to the battery, and, since 
the end of World War II, an ambitious joint project of the Air Force, 
Navy, and American Institute for Research in which a battery of 
paper-and-pencil tests is being developed which will measure with maximum 
economy all of the characteristics which have so far been found to 
contribute to flying success. Studies were also made which ascertained the 
predictive value of the wartime battery and its components for success 
in combat ([by); this was found to be significant, although attenuated 
by the relatively small and select group of pilots which reached combat 
and the complexity of the criterion: the number of planes shot down 
by a fighter pilot in England in icjji» cannot be compared, for example, 
to the number shot down in the same theater in 1945 when air superiority 
had changed hands and daylight bomber raids were unknown. 
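
One source of the attenuation mentioned here, the select character of the group reaching combat, is conventionally handled by the correction for restriction of range on the predictor. The sketch below is illustrative only and is not a computation reported in the studies cited: the correlation and the ratio of standard deviations are hypothetical, and the function name is an assumption of this example.

```python
import math


def correct_for_range_restriction(restricted_r, sd_ratio):
    """Univariate correction for restriction of range on the predictor.

    sd_ratio is the predictor's standard deviation in the unselected group
    divided by its standard deviation in the selected group (1.0 or more).
    """
    if sd_ratio < 1.0:
        raise ValueError("sd_ratio must be at least 1.0")
    numerator = restricted_r * sd_ratio
    denominator = math.sqrt(1.0 + restricted_r ** 2 * (sd_ratio ** 2 - 1.0))
    return numerator / denominator


# Hypothetical figures: a validity of .30 observed in a selected group.
for ratio in (1.0, 1.5, 2.0):
    corrected = correct_for_range_restriction(0.30, ratio)
    print(f"SD ratio {ratio:.1f}: estimated unrestricted validity {corrected:.2f}")
```

The correction addresses selection only; it does nothing for the second difficulty noted above, the changing meaning of the combat criterion from one period and theater to another.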

The American Institute for Research, established by Flanagan and 
other aviation psychologists on the basis of their wartime experience, 
has carried out a number of research projects for the Civil Aeronautics 
Administration and several of the commercial airlines, analyzing the 
work of the airline pilot and constructing a battery of tests for the 
evaluation of pilot proficiency which might be used in selecting 
personnel for commercial airlines. The Institute has established testing centers 
at which the current form of this battery is now being used in such 
selection, but data concerning it have not yet appeared in the literature. 

Psychologists. The post-war demand for clinical and vocational 
psychologists resulted in a great increase in the number of candidates for 
training in psychology and a strain on training facilities. Graduate 
departments of psychology, the Veterans Administration, the U.S. Public 
Health Service, and the American Psychological Association worked 
together on the problem of improving the selection and training of 
psychologists. One result of this co-operation was a project carried out at 
the University of Michigan, under the direction of E. L. Kelly, for the 
development of a battery of tests for the selection of students for 
training in clinical psychology; another project provided for the study and 
revision of the psychologist scale for scoring Strong’s Vocational Interest 
Blank under the direction of D. G. Paterson of the University of 
Minnesota. 

Salesmen. More attention has been devoted by business and industry 
to the problem of selecting salesmen than to any other single group save 
possibly executives. Unfortunately too many business concerns have been 
so near-sighted that they have been willing to employ psychological 
consultants for actual selection work but have not been willing to finance 
the research which should precede the development of any new method 
or instrument, whether it be psychological, chemical, or mechanical. 
Even scientifically trained executives such as engineers often fail to 
realize that developmental work must be done in personnel selection 
just as in manufacturing. And there have too often been psychologists 
and pseudo-psychologists available who were willing, either through 
ignorance of the complexities of personnel testing, or through eagerness 
to supplement academic incomes, to attempt to meet the needs of 
business and industry on their own inadequate terms. So-called institutes for 
aptitude testing therefore flourish in most of our large cities, testing 
candidates for sales positions and making recommendations to referring 
employers which are based to an undetermined extent upon hunches and 
shrewd judgments made independently of the tests, and partly upon 
clinical evaluation of test scores, as described by Flemming and 
Flemming (266) and discussed in connection with executives, above. 

The Moss Test for Ability to Sell (Center for Psychological Service, 
George Washington University, 1929) is one of the few tests or batteries 
of tests marketed as a device for selecting salesmen. It consists of items 
designed to test memory for names and faces, judgment in sales 
situations, observation of behavior, comprehension and retention of selling 
points in reading material, following directions in making out sales 
records, and sales arithmetic, and has norms based on department store 
salespersons. Although it has been tried in numerous sales situations, the 
results have not generally been published in the journals. The prevailing 
opinion of it among department store personnel workers known to the 
writer is not favorable. 

The majority of researchers who have experimented with test batteries 
for the selection of salesmen have utilized personal history blanks, 
interest inventories, and personality inventories, as well as intelligence 
tests. The first-named are generally custom-built, the second is usually 
Strong’s Vocational Interest Blank as a source of either a score obtained 
from a standard key or of items for the development of a new key, and 
the personality measures have included the Bernreuter Personality 
Inventory, the Humm-Wadsworth Temperament Test, or other 
well-known inventories. For example, Bills (90) reported on the use of the 
life insurance and real estate salesman’s keys of Strong’s Blank, personal 
data, the Bernreuter, and a mental alertness test; the last two were of 
little value, but the others, combined, significantly improved the 
selection of successful salesmen. Kurtz (| j<)) worked with life insurance 
salesmen, using personal history items and Kornhauser’s personality inventory 
and obtaining correlations of . jo with production. Men who rated A had 
twice the chance of staying in the business for a year that men with E 
ratings had. Similar findings have been reported with salesmen of more 
tangible things than life or casualty insurance. Otis (5X0) used personal 
data items, a combination of Strong’s life insurance and real estate keys, 
and the Bernreuter, with salesmen of a detergent company, finding that 
the first two were effective predictors of success while the last-named test 
was not. Building materials salesmen were studied by Ohmann (',72), 
who used only personal data; he found a correlation of .b 7 between a 
questionnaire of 13 items and his most reliable criterion, annual 
commission earnings. Viteles (902) tried the Humm-Wadsworth 
Temperament Test with 59 appliance salesmen, but found that 12 of the 20 who 
had “desirable” patterns were discharged or resigned during the try-out 
period. 

From studies such as these, more thoroughly reviewed by Schultz (liX^) 
and by Kornhauser and Schultz (jp^), the conclusion to be drawn is that, 
contrary to the expectation of many personnel consultants, personality 
inventories have little or no value in the selection of salesmen. The 
reasons for this will be discussed in a later chapter dealing with such 
instruments. The most effective batteries have consisted of the sales keys 
of Strong’s Vocational Interest Blank and, especially, custom-built 
personal history questionnaires. The nature of the personal history items 
which prove valuable varies somewhat with the type of saleswork, but 
some consistent trends are revealed. In Ohmann’s study the 13 valid 
items were as follows: height, age, marital status, number of dependents, 
amount of life insurance, debts, years of education, number of clubs and 
organizations belonged to, years on the last job, experience in the line of 
sales in question, average number of years on all jobs, average monthly 
earnings on the last job, and reasons for leaving the last job. It is notable 
that, although these salesmen were handling a tangible, building 
materials, the success of life insurance salesmen has also been found to be 
related to age, marital status, dependents, amount of life insurance, 
organizations belonged to, etc. (91). Stokes (762), reviewing what 
experience has shown to be important in research in the selection of salesmen, 
has like others emphasized the need to take into account the job 
environment of the salesman, pointing up the fact that, despite the similarities 
which exist between sales jobs, and the more or less universal validity 
of Strong’s sales keys, specific factors are found in any job which make 
custom-built batteries of tests more valid than standard tests. His second 
point then follows of necessity: research in the selection of salesmen 
must be dynamic, for it must continue to take into account the changes 
which take place in the environment in which the salesman is working 
and therefore in the demands of his job. The fact that Strong’s 
Vocational Interest Blank has been found to predict success in sales jobs, but 
in very few other occupations (see the discussion of Strong’s Blank in 
Chapter 17), appears to confirm this point concerning the special 
importance of interest and motivational factors in selling. 

Scientists. The importance of scientific occupations was emphasized 
as never before during World War II and its aftermath, when some 
countries such as Great Britain kept their science students and scientists 
draft-exempt because of their potential contributions to the war effort, 
and when the various Allies engaged in a scramble for the talents of the 
scientists of the conquered countries, particularly Germany. Although 
there have been small-scale attempts at the development of techniques 
for predicting success in science prior to the second World War, it was 
only during and after it that national efforts were organized to locate 
scientific talent and to encourage its training. With such ability at a 
premium it seems likely that its selection will receive even more 
attention in the future than medicine has in the past and than psychology 
is receiving at the time of writing. 

The Stanford (or Zyve) Scientific Aptitude Test (Stanford University 
Press, 1929) is probably the first published attempt to develop a measure 
of scientific aptitude, but little work has been done with it since either 
by its author or by others despite its continued use. The test attempts to 
measure the components of scientific aptitude, science being defined as 
organized knowledge based on experiment and observation. The test 
therefore consists of eleven parts, designed to measure experimental bent 
by expressions of preference for experimental as opposed to 
bibliographical or other methods of obtaining information, clarity of definitions, 
suspended versus snap judgment as manifested in ability to state that 
answers to problems are not available, reasoning concerning physical 
problems (in four parts differing in content), caution and thoroughness 
as demonstrated in the solution of apparently easy problems, ability to 
select and arrange experimental data for the solution of a problem, 
comprehension of scientific reading matter, and perception of complex 
spatial detail. The items were developed and checked with the aid of 
established scientists, and were validated against grades in scientific courses. 

The correlation with intelligence tests, according to the manual, was 
found to be .51 with college students. The correlation with the grades of 
science students was .r,o, in contrast with that of .27 for the Thorndike 
Intelligence Examination; the correlations with grades of non-scientific 
students were respectively .02 and from .38 to .53, which strongly suggests 
that the test does measure intellectual factors which are important to 
success in scientific but not in literary endeavor. 

The Stanford test was administered by Benton and Perry (7',) to 43 
students (30 science majors, 13 others) at the College of the City of New 
York. They found correlations of .30 and .37 between this test and 
four-year grades, while intelligence as measured by the A.C.E. Psychological 
Examination had a validity of .31 with total grades, .27 with science 
grades, and .ji with non-science grades. The intercorrelation of the two 
tests was .pp. Studies of this test have been so few and are so inconclusive 
that it is difficult to judge its validity, especially when the attenuation 
of validities usually noted in studies made after the original authors’ is 
kept in mind. 

The Science Talent Search administered by Science Service and 
financed by the Westinghouse Electric Corporation is a project in which 
one might expect to find a battery of tests for the selection of potential 
scientists being developed. The selection procedure consists of a series 
of five hurdles: a Science Aptitude Examination, high school grades, a 
recommendation by teachers, an essay on a scientific topic, and 
psychological and psychiatric interviews (235). The Science Aptitude Test first 
used was a reading test of scientific subject matter, but in later years 
what amounted to a battery of tests was utilized. A variety of types of 
items were used, including both scientific vocabulary and Bennett-type 
mechanical comprehension pictures; scores were independent of amount 
of mathematics and science studied; but validity data have not as yet 
been made available, making evaluation of the procedure impossible at 
present. 

"Scientific aptitude” being presumably largely an intellectual matter, 
it seems likely that batteries of tests for the selection of promising 
scientists will stress such factors as reasoning, spatial visualization, and 
number ability; scientific vocabulary and mechanical comprehension are 
two less pure aptitudes which should also be significant; and inventoried 
interest may prove to have value for completion and occupational 
utilization of training if not for quality of work done. It seems strange that 
work has not been done with such a battery. 

Teachers. Tests of aptitude for teaching have been experimented 
with by a number of individuals and schools of education, in attempts 
to improve the selection of students of education. The New York State 
Department of Education and the Psychological Service Center of George 
Washington University are among the institutions which have published 
custom-built tests of so-called teaching aptitude. Other institutions such 
as the University of Wisconsin and the University of California at Los 
Angeles have worked with batteries of standard tests in attempting to 
develop sound selection procedures. Tests for the evaluation of 
preparedness for teaching have been prepared by the Educational Testing Service 
as the National Teacher Examinations (28:9), administered annually to 
candidates for teaching positions who wish to have an objective record 
of their mastery of subject matter made available to possible employers. 

The Coxe-Orleans Prognosis Test of Teaching Ability (World Book 
Co., 1930) is a good example of custom-built tests of aptitude for 
teaching. It consists of five subtests: general information, knowledge of 
teaching methods and practices, ability to learn the type of material 
included in professional texts, comprehension of educational reading 
matter, and judgment in handling educational problems. Validation 
of this instrument has been in terms of success in teacher training, but 
the data are not very helpful because they consist of correlations between 
the prognostic test and a comprehensive achievement test at the end of 
the first year of training. These coefficients range from .53 to .84 as cited 
in the manual; but in view of the highly academic nature and similar 
content of both tests the evidence is not convincing. 

It has apparently not been validated against criteria of success on the 
job. In view of the difficulties commonly encountered in establishing 
criteria of success in teaching this is perhaps understandable. The validity 
coefficients will undoubtedly be much lower than those reported in the 
manual, since teaching is less exclusively dependent upon intellectual 
ability than is learning about teaching. 

Seagoe’s studies (t) 8 f, 688, 689) are a good illustration of work with 
standard tests in the selection of students in schools of education. She 
administered the American Council Psychological Examination, 
Co-operative General Culture Test, Meier Art Judgment Test, Seashore 
Tests of Musical Talents, Strong Vocational Interest Blank, 
Allport-Vernon Study of Values, Bell Adjustment Inventory, Bernreuter 
Personality Inventory, and Humm-Wadsworth Temperament Test, to 125 
students of education. Ratings of success in two practice-teaching 
assignments were obtained for 31 of these students, and were correlated with 
the test scores (688). No significant relationships were found between 
the tests of intelligence, special aptitudes, achievement, interest, or values 
and the ratings of success in practice teaching; relationships between 
personality inventory scores and ratings were significant, those for the 
Bell keys being — 90 (total adjustment) and that for the Bernreuter 
Self-Confidence scale being —.38. Twenty-five of these students were 
followed up after two years of teaching in the field, using rank in the 
faculty as judged by the school administrator as criterion (689); the Bell 
and Bernreuter were again found to have some validity, as did ratings 
by critic teachers; grade-point ratio had none. 

The numbers in Seagoe’s studies, as in other studies of the same type, 
are small and criteria of success need to be improved, before objective 
selection procedures can be considered adequate in this field. But as long 
as teaching remains an underpaid occupation with too few applicants 
for available positions there is not likely to be much pressure for the 
development of better selection methods, at least in most training 
institutions. 
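
The caution about small numbers can be made concrete with an approximate confidence interval for a correlation coefficient. The sketch below is illustrative only: it takes the coefficient of -.38 and the group of 31 students quoted above, applies Fisher's z transformation, and assumes the usual conditions for that approximation; the function name is an assumption of this example and the computation is not one reported by Seagoe.

```python
import math


def correlation_confidence_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z.

    z_crit = 1.96 corresponds to a 95 percent interval.
    """
    if n <= 3:
        raise ValueError("at least four cases are required")
    z = math.atanh(r)                   # Fisher's z transformation of r
    standard_error = 1.0 / math.sqrt(n - 3)
    low = math.tanh(z - z_crit * standard_error)
    high = math.tanh(z + z_crit * standard_error)
    return low, high


low, high = correlation_confidence_interval(-0.38, 31)
print(f"r = -.38 with n = 31: 95 percent interval from {low:.2f} to {high:.2f}")
```

With only 31 cases the interval runs from about -.65 to about -.03: the coefficient is distinguishable from zero, but its true size remains very uncertain.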

The National Teacher Examinations (Educational Testing Service, 
annually since 1939) provide school systems and graduate schools of 
education which can afford to be selective with a standard battery of tests 
for the evaluation of teachers’ mastery of subject matter, reasoning, and 
judgment. These are, obviously, only intellectual aspects of ability to 
teach, and do not include interest in children, emotional stability, and 
other factors which are generally believed to be important to teaching 
success. But Flanagan (263) found that scores on this battery of tests 
had a correlation of .51 with ratings of .jg teachers in 22 school systems 
made by two supervisors and five students in each case, which indicates 
that the tests have value in selecting good teachers despite the fact that 
they do not measure everything that is to be considered. As Flanagan 
points out, other characteristics must be appraised by means of inter¬ 
views, ratings, and recommendations in the absence of more objective 
methods. 



CHAPTER XI 


STANDARD BATTERIES WITH 
NORMS FOR SPECIFIC 
OCCUPATIONS 

THE characteristics, advantages, and disadvantages of standard batteries 
consisting of generalized items which can be validated and weighted as 
tests rather than as items, and for which norms can be developed for a 
great variety of occupations, have been discussed at the beginning of the 
preceding chapter. In this chapter, therefore, it is necessary only to 
describe and discuss the two such batteries which are currently coming into 
use: the General Aptitude Test Battery of the United States Employment 
Service, and the Differential Aptitude Tests of the Psychological 
Corporation. It might be added parenthetically that other such batteries are 
being published, notably by Guilford (320) and by the American Institute 
for Research, but that the task of obtaining occupational norms is so 
great that only well-financed organizations can ethically undertake it. 
The day of the publication of isolated tests of single aptitudes will no 
doubt soon be past. 

The General Aptitude Test Battery (United States Employment Service, 
1947) 

This battery is the product of more than ten years of research in worker 
characteristics and test development by the Occupational Analysis 
Division of the United States Employment Service, described most completely 
in two journal articles by Shartle, Dvorak, Heinz, and others (735,225). 
This comprehensive program of research in vocational aptitudes was 
itself the outgrowth, insofar as principles and technical matters are 
concerned, of the Employment Stabilization Research Institute of the 
University of Minnesota, the work of which has been frequently 
encountered throughout this book. With such a long and fruitful history 
behind it, it is natural to expect that this battery should prove a 
landmark in the history of the appraisal of vocational promise. Dvorak’s 
description of it (224,225) encourages such expectations. She states: 
“With the General Aptitude Test Battery, however, it is possible to 
obtain information about an individual’s aptitude for several thousand 
occupations in little more than two hours of testing.” It is explained 
in another paragraph that this is done by means of norms for 20 fields 
of work, representing nearly 2000 occupations grouped as in Part IV of 
the Dictionary of Occupational Titles (888) but, in this case, on the 
basis of similar minimum amounts of the same combination of aptitudes. 
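
The principle of norms expressed as minimum amounts of a combination of aptitudes can be sketched as a multiple cut-off comparison. The example below is illustrative only: the field names, aptitude names, and minimum scores are all hypothetical, invented for this sketch, and are not the actual norms of the General Aptitude Test Battery.

```python
# Hypothetical minimum aptitude scores for three invented fields of work.
# None of these names or figures is taken from the actual battery.
FIELD_MINIMUMS = {
    "Field A": {"verbal": 110, "numerical": 105},
    "Field B": {"spatial": 100, "manual_dexterity": 95},
    "Field C": {"clerical_perception": 105, "finger_dexterity": 90},
}


def qualifying_fields(aptitude_scores, field_minimums):
    """Return the fields in which every required minimum is met or exceeded.

    An aptitude missing from the applicant's record is treated as a score
    of zero, so that the field in question is not counted as qualifying.
    """
    return [
        field
        for field, minimums in field_minimums.items()
        if all(aptitude_scores.get(aptitude, 0) >= minimum
               for aptitude, minimum in minimums.items())
    ]


# A hypothetical applicant's aptitude scores.
applicant = {
    "verbal": 112,
    "numerical": 101,
    "spatial": 108,
    "manual_dexterity": 99,
    "clerical_perception": 104,
    "finger_dexterity": 95,
}
print(qualifying_fields(applicant, FIELD_MINIMUMS))
```

A single set of aptitude scores is thus compared with the minimum pattern of each field in turn, which is how one brief testing session can be made to bear on a large number of occupations.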

This represents a very real accomplishment, as is evidenced by the 
meagerness of the occupational norms which discussions of the majority 
of tests in this book have revealed. Unfortunately, Dvorak’s two identical 
articles (224,225) are lacking in many of the details which are necessary 
for judging the adequacy of the norms, validities, and other basic data 
concerning tests, data which are now rather routinely reported by 
professionally competent and ethical test constructors and publishers. As 
the tests are not available for use outside of the United States 
Employment Service and cooperating public schools this is not a matter of 
practical urgency to counselors as users of tests, but it is a matter of 
great importance to personnel men, to the federal and state governments, 
and to the profession as a whole that the tests used by Employment 
Service Counselors be not only adequate but demonstrably so. In the 
discussion which follows, based on Dvorak’s articles, on the tests themselves, 
and on the training manuals and directions which accompany them, 
some of the important unknowns will be brought out; it is to be hoped 
that subsequent publications will provide the needed information. 

Applicability. The General Aptitude Test Battery was developed for
use with adult employment applicants, including older adolescents
recently out of school, who are in need of vocational counseling in
connection with registration at the offices of the federal-state Employment
Service. It is to be used when other evidence concerning aptitudes is
unsatisfactory, when other important abilities are suspected, when the
applicant has difficulty choosing among several seemingly suitable fields,
and when the applicant needs a better understanding of his vocational
strengths and weaknesses. No data are available concerning the
differences in the performances of adolescents, young adults, and older
persons; they would be desirable as a guide to interpreting the scores of
recent high school graduates.

Contents. The battery consists of 15 tests, the scores of which are
combined to yield scores for 10 factors. The paper-and-pencil tests are
printed in two booklets totaling 70 pages; the apparatus tests consist
of a rectangular manual dexterity box or pegboard and a small
rectangular board for the finger dexterity test. The subtests in the booklets
are as follows: Tool Matching, a test for perception of similarities and
differences in the black and white shading of simple pictures of familiar
tools; Name Comparison, resembling the Minnesota Clerical (Names)
Test; H-Marking, somewhat like the MacQuarrie (Pursuit) Test;
Computation, consisting of addition, subtraction, etc.; Two-Dimensional
Space, resembling the Revised Minnesota Paper Form Board; Speed,
like the Dotting Test of the MacQuarrie; Three-Dimensional Space, a
metal or paper-folding test; Arithmetic Reasoning, verbally expressed
arithmetic problems; Vocabulary, a same-opposites test; Mark-Making,
a manually more complex dotting test; and Form Matching, like the
analogies tests of the A.C.E. Psychological Examination. The Pegboard
yields two scores, one for placing and one for turning, as in the
Minnesota Manual Dexterity Test, but the pegs are smaller than the disks of
the latter test, and both hands are used in placing. The Finger Dexterity
Board is administered for both assembly and disassembly. The USES
policy appears to have been to construct items as much as possible like
those of earlier standard tests which had proved valid.

Administration and Scoring. Administration of the General Aptitude
Test Battery requires about two and one-quarter hours. The two booklets
of paper-and-pencil tests are designed for group testing; this is true also
of the apparatus tests, which are so constructed that in taking one part
of the test the examinee automatically sets them up for the next test.
Answers to paper-and-pencil tests are recorded in the test booklets, which
makes testing somewhat more expensive than it would be with special
answer sheets, but the additional expense may be warranted by the
greater ease of administration to a heterogeneous population. Stencils are
provided for scoring, which is objective and simple. Raw scores for each
part are changed to “converted scores” by means of a conversion table;
these are summated by groups to provide “aptitude scores” for each of
the 10 factors measured by the 15 tests. These are standard scores, with
a mean of 100 and a standard deviation of 20.
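
The scale just described can be illustrated with a short sketch. It assumes a plain linear standard-score transformation onto a mean of 100 and a standard deviation of 20; the actual USES conversion tables are not published, so the reference-group figures and the function name `to_standard_score` are hypothetical.

```python
from statistics import mean, pstdev

def to_standard_score(value, reference, target_mean=100.0, target_sd=20.0):
    """Linearly rescale `value` against a reference group of summed scores."""
    ref_mean = mean(reference)
    ref_sd = pstdev(reference)  # population SD of the reference group
    if ref_sd == 0:
        raise ValueError("reference group has no variation")
    z = (value - ref_mean) / ref_sd
    return target_mean + target_sd * z

# Hypothetical summed converted scores for a standardization group.
reference_group = [40, 50, 60, 70, 80]

average_person = to_standard_score(60, reference_group)  # at the group mean
able_person = to_standard_score(80, reference_group)     # well above the mean
```

On such a scale a person at the reference-group mean receives 100, and each standard deviation above or below the mean adds or subtracts 20 points.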

The 10 aptitude scores obtained from the 15 tests are described as
follows:

G — Intelligence: general learning ability, ability to grasp instructions 
and underlying principles. It is often referred to as scholastic aptitude. 

V—Verbal Aptitude: ability to understand the meaning of words
and paragraphs, to grasp concepts presented in verbal form, and to
present ideas clearly. 

N—Numerical Aptitude: ability to perform arithmetic operations 
quickly and accurately. 

S—Spatial Aptitude: ability to visualize objects in space and to
understand the relationships between plane and solid forms.

P—Form Perception: ability to perceive pertinent detail in objects 
or in graphic material, to make visual comparisons and discriminations 
in shapes and shadings. 

Q—Clerical Perception: ability to perceive pertinent detail in verbal
or numerical material, to observe differences in copy, tables, lists, etc.
It might also be called proofreading.

A — Aiming or Eye-Hand Co-ordination: ability to co-ordinate hand 
movements with judgments made visually. 

T—Motor Speed: ability to make hand movements, such as tapping, 
rapidly. 

F—Finger Dexterity: ability to move the fingers and to manipulate 
small objects rapidly and accurately. 

M—Manual Dexterity: ability to move the hands easily and skillfully, 
a grosser type of movement than finger dexterity, involving the arms and 
even the body to a greater extent. 

It can be seen from the above that the General Aptitude Test Battery
measures most of the aptitudes which have so far been isolated. There
is no measure of mechanical comprehension, but we have seen that this
is not a factorially pure aptitude, but rather a composite of aptitude and
experience, of which spatial comprehension is the major component.
Artistic judgment and the musical capacities are not tapped, but they are
of very specialized significance and perhaps wisely omitted from a general
aptitude battery. Interests and personality are not assessed, but these are
not aptitudes. The GATB therefore includes all of the aptitudes
discussed in this book, all of those isolated in earlier factor analyses of
abilities except memory (if Thurstone’s Reasoning and Induction factors
may be considered subsumed in G), and some newly isolated factors.

Norms. No mention is made, either in Dvorak’s paper or in the
manuals published for use of the Employment Service, of the number of
persons in each occupation or field for which norms are provided. These
may be large and representative both as to fields and as to parts of the
country, but evidence on the matter has not been presented either to the
public or to the staff members of the Employment Service who use
the tests and should have data as to their scientific basis. There is every
reason to assume that an agency with the resources of the USES, and
persons with the test construction experience of Shartle and Dvorak,
would do a workmanlike job of developing norms; on the other hand,
the admittedly preliminary work described in Stead and Shartle (750)
involves numbers which are smaller than one would like, and the
occupational ability patterns developed for use in selection for specific jobs
(a series of batteries quite distinct from the GATB) were, at the time
of writing, based on so few cases that they were used tentatively and
with extreme caution, and then only by well-qualified examiners. It is
to be hoped that data on the numbers involved in each of the 20 fields
and 2000 occupations will be made available.

The occupational-field norms are utilized to establish cut-off scores
for each aptitude which plays a significant part in each field. Thus
Occupational Aptitude Pattern No. 1 has a cut-off standard score of 130
for G, general intelligence, and of 130 for Verbal Ability; this pattern
is for a field which includes literary work (D.O.T. code 0-X3), creative
writing (0-X3.1), and copy writing and journalism (0-X3.5); the field
might perhaps be called the Literary Field, although the fields have not
been officially named because of the fact that the development of more
tests and the establishment of patterns for more occupations may change
their apparent nature. For example, what seems to be an electrical
assembly field will probably include other types of small, but not fine,
routine technical assembly work as other occupations are studied. The
cut-off score for a given aptitude for a given occupation is that below
which one-third of the occupational group in question were found to
fall. The publications give no reason for the selection of this point rather
than the quartile or some other figure. Cut-off scores in selection
programs are based on the percentage of satisfactory workers which would
have been accepted, and the percentage of unsatisfactory workers which
would have been rejected, on the basis of that cut-off point; but in
guidance the establishment of such cut-offs is extremely difficult because of
varying criteria. The use of production and worksample criteria in the
preliminary USES studies (750) suggests that this may have been done
for the various selection batteries; if it was done for the GATB it should
be described. If, on the other hand, the cut-off score was established at
the 33rd percentile merely as a point distinguishing a “less able” from a
“more able” group of workers, so labeled on the basis of tests whose
validity was believed to have been sufficiently demonstrated in previous
studies by other psychologists, this also should be made clear. Such a
cut-off score is useful, but being below it is less prognostic of failure than
when the cut-off score is a point below which few succeed.
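
The two ways of fixing a cut-off contrasted above can be sketched as follows. This is only an illustration under stated assumptions: every score is invented, the function names are hypothetical, and the simple rank rule stands in for whatever procedure the USES actually followed, which the publications do not describe.

```python
def cutoff_lowest_third(scores):
    """Return the score below which (at most) one-third of the group fall."""
    ordered = sorted(scores)
    k = len(ordered) // 3  # number of workers in the lowest third
    return ordered[k]

def selection_rates(satisfactory, unsatisfactory, cutoff):
    """Share of satisfactory workers accepted and unsatisfactory workers rejected."""
    accepted = sum(s >= cutoff for s in satisfactory) / len(satisfactory)
    rejected = sum(s < cutoff for s in unsatisfactory) / len(unsatisfactory)
    return accepted, rejected

# Hypothetical aptitude scores for one occupational group (guidance-type cut-off).
group_scores = [85, 90, 95, 100, 105, 110, 115, 120, 125]
cutoff = cutoff_lowest_third(group_scores)

# Hypothetical criterion groups (selection-type evaluation of the same cut-off).
satisfactory = [95, 100, 105, 110, 120]
unsatisfactory = [80, 85, 90, 100]
accepted, rejected = selection_rates(satisfactory, unsatisfactory, cutoff)
```

The first function needs only the score distribution of employed workers; the second needs a criterion of job success, which is the information the text notes is hard to obtain in guidance.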

Standardization and Initial Validation. No sequential picture of the
development of the General Aptitude Test Battery has as yet been
published. But it has no mean history, and an integrated account of
the work of which it was a part would have considerable value. The
genesis of the idea was in the Minnesota Employment Stabilization
Research Institute, written up by Paterson and Darley (589) and by Dvorak
(223); early work by the USES is described by Stead, Shartle, and others
(750), but this work was still partly with published tests and did not
concern the General Aptitude Test Battery; a factor analysis of these
and other tests was published by the Staff of the Occupational Analysis
Division in 1945 (735); and the Dvorak articles (224,225) describe the
battery and the procedure of standardization and validation without
giving any of the results. Even the procedural material is general in
nature. As described by Dvorak the standardization procedure began
with job analysis, to identify the job and define the sample population.
Persons were then included in the sample if they were performing the
same type of work, had passed the learning stage, and were rated
satisfactory by their supervisors. Care was taken to make the samples
all-inclusive or representative. Although she thus describes the sampling
procedure, Dvorak says nothing about the construction of the tests prior
to standardization testing. Neither does she describe the validation or
norming processes, beyond stating that the cut-off scores are placed at
the point which eliminates the lowest third of the occupational group.

In the paper in which Dvorak collaborated with other staff members
(735) somewhat more data are given in connection with the USES’s factor
analysis study. In this report nothing is said specifically about the GATB
but it is evident from the discussion and from the tests listed that it was
included, along with 44 other tests. Based on this total of 59 different
tests, administered in various combinations to groups of from 99 to 1079
persons, or a total of 2156 individuals, in 13 different communities
scattered across the country, this is one of the most thorough factor analyses
of aptitude tests which has been made. It is therefore regrettable that
here, too, the presentation is rather general and omits much of the detail
which the careful reader of the literature and the conscientious and
insightful user of tests needs.

Despite these limitations the report is helpful. It gives some idea of
the empirical justification for using the tests which are in the battery,
and especially for combining them to yield factors or aptitude scores.
This is fortunate, as it is in this respect more than in any other (except
its occupational norms) that this battery differs from the Psychological
Corporation’s Differential Aptitude Test Battery. There is, for example,
the justification for grouping three of the GATB tests
(Three-Dimensional Space, Arithmetic Reasoning, and Vocabulary) to yield a score
for general intelligence. One’s first reaction might be to assume that
this was merely a catering to the layman’s desire to think in terms of
“intelligence” because tests have yielded such scores for a generation;
on the contrary, the report of the factor analysis makes it clear that it
was a step made necessary by the evidence. As the authors state: “it
appears to have some of the properties of Spearman’s G (sic), but the
two-factor theory has no place for group factors like V, N, or S (which
also were isolated). On the other hand, this factor has a wider
significance and is more persistent than either Thurstone’s R or I. It appears
to possess many of the properties that teachers, test examiners, and
clinical psychologists would attribute to ‘intelligence’ . . . this factor
has been designated, noncommittally, as Factor O.” In the manuals it
is designated as G, and is uncompromisingly called intelligence. It is
interesting that this finding of a general intelligence factor was
accomplished, not with Spearman’s two-factor statistical procedures, but with
the use of Thurstone’s centroid method of factor analysis, which has not
on other occasions revealed a general factor. Furthermore, the sample
was one of young adults, aged 17 to 39, rather than one of children in
whom maturation rates would tend to produce a seemingly general
factor.

In view of the fact that studies such as these were part of the process
of developing the battery, it seems legitimate to assume that the
procedures of constructing the actual tests in the battery were well conceived
and carried out. It is to be hoped, however, that more of the details of
this procedure, of the sampling procedure described above, and of the
validation procedure will be published.

The only available evidence of validity lies in the cut-off scores for
the various occupational groups, and this is only implicit evidence not
analyzed or reported as such; the published material says nothing about
validation. The fact that the Verbal Aptitude cut-off (standard) score
of “literary” workers is 130, while that of “copy” workers (the terms in
quotations are the writer’s) is 100, and that the Form Perception cut-off
score of “technical assembly” workers is 100 while that of “routine
assembly” workers is 85, is evidence of occupational differentiation and
therefore of validity. The 20 occupational fields, tentatively named for
convenience’s sake by the present writer, are listed below, together with
the codes and titles from Part IV of the Dictionary of Occupational
Titles (888), the aptitudes required and representative occupations.
They provide evidence not only of the validity of the battery (its ability
to differentiate between occupations) but also of its significance for the
classification of occupations.

1. Literary occupations, 0-X3, require a high degree of general
intelligence and verbal ability; they include creative writing, translating,
copy writing, and journalism.

2. Computational work, 0-X7.1, embraces the accounting occupations;
it is engaged in by persons with a high degree of intellectual and
numerical ability.

3. Engineering occupations, 0-X7.4, include at least some of the
engineering fields, the aptitudes required being intellectual, numerical,
and spatial in high degree and form perception in a moderately high
degree.

4. Technical-mechanical work, 4-X2.010 and 4-X2.100, requires
average amounts of intelligence, number ability, and spatial ability, and a
fair degree of finger dexterity. The field includes machine-shop and
all-around mechanical repair occupations.

5. Record work, 1-X1, 1-X2.0, involves average general intelligence,
moderately high numerical ability, and average clerical perception.
Included are routine computing and general recording.

6. Artistic design occupations, 0-X1, are characterized by moderately
high intelligence, average spatial ability, and moderately high form
perception. The field includes artistic drawing and arranging.

7. Technical-electrical work, 4-X6.18, requires a fair degree of
intelligence, together with average spatial ability, form perception, and finger
dexterity; it includes electrical wiring and radio repair.

8. Copy work, 1-X2.2 and 3, 4-X6.56, is performed by persons, the
majority of whom have average or better verbal ability and clerical
perception, and fair motor speed and finger dexterity. Occupations are
both clerical (typist, stenographer) and skilled (typesetter, hand
composer).

9. Mechanical work, 4-X2.103 and 4-X2.104, is characterized by average
or better numerical and spatial abilities, and fair form perception and
manual dexterity. The occupations known to be included are
combustion engine and aircraft equipment repairman.

10. Industrial design, 0-X7.7, involves moderately high numerical,
spatial, and form-perception abilities, and average or better aiming or
eye-hand co-ordination. Typical occupations are various kinds of
drafting.

11. Routine recording, 1-X2.8, involves average clerical perception
and fair numerical ability, and includes not only routine record-keeping
jobs but also equipment and material checking.

12. Business machine operation, 1-X.1, 1-X.2, differs from record
work (#5) in that it requires less general intelligence, only average or
better numerical ability, and, in addition, average or better motor speed
and fair finger dexterity; it also requires average or better clerical
perception. The occupations included have the same Part IV DOT
classification, which suggests that the latter does not make sufficiently refined
distinctions in this area: category No. 5 is more mental, No. 12 more
mechanical.

13. Structural work, 4-X6.2, is characterized by fair numerical, spatial,
and manual abilities; it includes not only structural work with heavy
metal, but also plumbing and carpentry.

14. Technical assembly, 4-X6.3, requires average or better form
perception and finger dexterity, and fair spatial and aiming abilities. Types
of assembly are: electrical units, mechanical units, and optical units,
including repair.

15. Shaping work, 4-X6.3, involves manual rather than finger
dexterity, demands no special facility in eye-hand co-ordination, is otherwise
like the technical assembly field; it includes grinding and tool dressing.

16. Visual inspection, 4-X6.38 and 6-X2.38, requires fair form
perception, whether for close or simple visual inspection.

17. Routine assembly, 6-X4.30, is characterized by average or better
eye-hand co-ordination and finger dexterity, and form perception.
Simple electrical unit assembly jobs are the only type so far included,
but no doubt certain nonelectrical jobs will be found in the same
category.

18. This heterogeneous category cannot at present be named. The
common characteristics are average form perception and fair manual
dexterity; the occupations include such metal trades as roller and
extruder, range through various stone-setting jobs, and also include visual
inspection jobs in metal, leather, lumber, meat-packing, and other
industries.

19. Classifying clerical work, 1-X4, requires average or better clerical
perception and motor speed, includes classifying jobs such as file and
mail clerks and directory compilers, and other clerical workers such as
office boys and sorters.

20. Machine operation, O-X4.4, involves fair amounts of eye-hand
co-ordination, motor speed, and finger and manual dexterity. Occupations
include a great variety of machine operating and tending jobs, from
machine sewing through metal polishing, wood sanding, and printing
press feeding and catching, to pipe bending.
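
One way a counselor might apply such patterns is sketched below. It assumes, as a reading of the text, that an applicant fits a field only when every aptitude in its pattern reaches the stated cut-off. The cut-offs of 130 for G and V in Pattern No. 1 are those quoted earlier; the second pattern, the applicant's profile, and the function name are hypothetical.

```python
# Cut-off standard scores per occupational aptitude pattern.
PATTERNS = {
    "Pattern 1 (literary)": {"G": 130, "V": 130},       # figures quoted in the text
    "Hypothetical assembly pattern": {"P": 100, "F": 100},  # invented for illustration
}

def qualifying_patterns(profile, patterns):
    """Return the names of patterns whose every cut-off the profile meets."""
    return [
        name
        for name, cutoffs in patterns.items()
        if all(profile.get(aptitude, 0) >= minimum
               for aptitude, minimum in cutoffs.items())
    ]

# Hypothetical applicant: strong in G and V, below the cut-off in P.
applicant = {"G": 135, "V": 131, "P": 95, "F": 110}
fields = qualifying_patterns(applicant, PATTERNS)
```

Under this multiple-cut-off reading a single low aptitude excludes a field however high the others are, which is why the choice of cut-off point matters so much.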

Reliability. No reliability data have yet been published. In view of
the types of items, the amount of work done with the battery, and the
qualifications of those supervising it, it seems hardly likely that they are
below .85. But this should be made explicit.

Validity. As the battery was tentatively put into use by the
Employment Service in the spring of 1947, and is restricted to that organization,
no studies of its validity as such have as yet been published in the
literature. In view of the long-range program which produced it, it seems very
likely that the General Aptitude Test Battery will prove to have
considerable validity. As its widespread use in the Employment Service
makes possible the rapid collection of data concerning large numbers of
people entering many different occupations, objective data concerning
its value in counseling as opposed to discriminating between persons
employed in various fields should be relatively easy to collect.

Use of the General Aptitude Test Battery in Counseling and Selection.
As this battery of tests is designed only for Employment Service use, at
least provisionally, there is in one sense no need to discuss their use in
this treatise. A few points, however, are worth noting for their general
significance. One is that the battery, although designed for counseling
and standardized with that in mind, could equally well be used for
selection purposes; it is composed of relatively pure tests, factorially
speaking, gives a variety of scores which seem to be of occupational
significance, and could well be validated against local criteria in a student
or employee selection program. Secondly, as the tests have all been
standardized on the same population, that upon which the standard scores are
based, have norms for a variety of occupations expressed in the cut-off
scores, and are to be administered to additional occupational groups for
norming purposes, the battery is potentially the most useful instrument
of individual diagnosis which has been developed. It should almost
certainly prove extremely valuable in colleges, guidance centers and
employment services, in dealing with young adults. If the normative data
were extended downward by the establishing of age and grade norms (a
much easier task than extending grade norms upward), and by
ascertaining the effect of maturation and experience on the test scores, it could
become an extremely useful instrument in the schools. It is for these
reasons that the battery has been described in so much detail in this text,
even though most users of this book do not now have access to it. It is
to be hoped that, tentative and incomplete though it is, it has established
a pattern which will be further developed and become the pattern for
the future.

The Differential Aptitude Tests (Psychological Corporation, 1947)

This battery of tests was developed by Bennett, Seashore, and Wesman,
in response to widespread feeling among vocational psychologists and
counselors that a major defect in current testing programs is the lack of
a uniform baseline for the various tests which are used with a given
student or client (Manual: A-3). We have seen, for example, that the
Revised Minnesota Paper Form Board has norms which are based on
differing groups in a few localities, and that the Bennett Mechanical
Comprehension Test has a totally different and equally limited base. A student
may be at the 65th percentile when compared to liberal arts college
freshmen on one test, and at the 55th on another, but actually have more
ability of the type measured by the second test: the seemingly lower score
may be due to differences in the normative groups. It is only when the
tests in a battery have been standardized on strictly comparable groups,
if not the same group, that one can effectively study aptitude or interpret
differences within individuals.
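
The point about normative groups can be made concrete with a small sketch. All figures are invented for illustration, and the mid-rank definition of a percentile rank used here is one common convention, not necessarily the one used in any of the manuals cited.

```python
def percentile_rank(score, norm_group):
    """Percentage of the norm group scoring below `score`, counting ties as half."""
    below = sum(1 for s in norm_group if s < score)
    ties = sum(1 for s in norm_group if s == score)
    return 100.0 * (below + 0.5 * ties) / len(norm_group)

# Two hypothetical norm groups for the same test: one more able than the other.
stronger_group = [30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
weaker_group = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]

raw_score = 50
rank_vs_stronger = percentile_rank(raw_score, stronger_group)
rank_vs_weaker = percentile_rank(raw_score, weaker_group)
```

The same raw score lands at a noticeably lower percentile against the stronger group than against the weaker one, so percentiles from differently based norms cannot be compared directly within one person's profile.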

Other needs also contributed to the development of this battery. One
was the improvement of statistical procedures which made possible the
construction of tests which effectively measure narrower aspects of ability
than general intelligence. We have already seen the development of
quantitative and linguistic scores for the A.C.E. Psychological
Examination, the Wechsler-Bellevue Scale, and other modern substitutes for the
undifferentiated tests of the times of Binet and Otis, and the further
development of factor scores by Thurstone in the Primary Mental
Abilities Tests. Still another was the time factor, for it is important that a
comprehensive battery be administrable in a reasonably brief period if
educational and occupational norms are to be obtained for all tests from
the same subjects. It is a sign of the times that both the United States
Employment Service and the Psychological Corporation have moved
simultaneously to meet these needs. The American Institute for Research
is preparing an integrated battery of its own, also for use in guidance;
Guilford (320) has released a similar battery; and it seems likely that
other test publishers will be forced either to follow suit in due course
(an expensive process) or to confine their energies to specialized fields
such as achievement, special talents (artistic, musical, manual), interest,
and personality. And even some of these will probably be removed from
the list as standard batteries are improved, for there is evidence which
suggests that paper-and-pencil tests of manual dexterities and interest
inventories (and perhaps tests) can be developed which will be much
more valuable if used as parts of an integrated battery.

Applicability. The battery was designed for use with high school
students, including 8th grade boys and girls. Items were devised for and
retained on the basis of their suitability for this age and ability range,
and the time limits and norms are based upon the performance of
samples of high school populations. They may therefore be considered
extremely effective at these levels. No attempt was made to make the tests
applicable to college students or adults, although use in personnel
selection was envisaged (Manual: A-1, A-5) and the items may well be suitable,
but the fact that the grade norms increase annually from grade 8 through
12 shows that special norms would be necessary. As age norms have not
yet been provided, and no analysis has been made of the effects of
progressive elimination in high school on the sample, it is still impossible
to draw any conclusions concerning the development of these abilities
from the preliminary work with these tests: seniors may make higher
average scores because they have lived and studied one year longer than
juniors, or because they have lost their less able classmates by the wayside.
Be this as it may, the development of college or adult norms has one
possible drawback in the ceiling of the tests: having been designed for
high school students, they might not permit the most able college
students and adults to show the full extent of their abilities.

Content. The Differential Aptitude Tests consist of eight tests designed
to measure eight different abilities. Some of the abilities are aptitudes in
the stricter sense of the term (Verbal Reasoning, Numerical Ability, Space
Relations, and perhaps Abstract Reasoning), others are factorially less
pure (Clerical Speed and Accuracy and Mechanical Reasoning) but can
be treated as aptitudes, while still others are proficiencies (Language
Usage: Spelling and Sentences). The last-named are, however, sufficiently
basic forms of achievement to be used effectively as indices of promise.
Because of the excellent descriptions in the manual, the following
paragraphs are in part abstracts of the manual.

The Verbal Reasoning Test attempts to measure ability to generalize,
to think with words a la Thurstone (V). It consists of verbal analogies, in
which the first member of the first pair and the second member of the
second pair have been omitted from the stem and must be selected from
two sets of items with four choices each, thus: ___ is to x as y is to ___.
Analogies were used because they have proved to be one of the
best types of reasoning test items, and the form chosen is highly reliable,
versatile, and lends itself to complexity without resort to esoteric terms.
Because of this latter fact, the vocabulary is relatively simple, the content
familiar, and complexity is a function of the reasoning processes involved.

The Numerical Ability Test is designed to measure understanding of
numerical relationships and facility in handling numerical concepts,
another of Thurstone’s factors (N). As the manual points out, the items
are cast in the form usually referred to as “arithmetic computation”
rather than “arithmetic reasoning”; the reason given is that language
problems are thus avoided, and that complexity was attained by the
numerical relationships and the processes to be used in the problems.

The Abstract Reasoning Test attempts to measure reasoning without
the use of words (Thurstone’s R). Problems are of a spatial type made
familiar by the A.C.E. Psychological Examination, and require finding
the principle underlying a series of changing geometric figures.

The Space Relations Test (Thurstone’s S) is the most ingenious in the
series, although embodying familiar principles. These are ability to
visualize a constructed object from a pattern (structural visualization in
three dimensions), and ability mentally to manipulate a form in order
to judge its appearance after rotation in various ways. By combining
these two principles in items which require the mental folding of cut
or partly shaded patterns a test of spatial visualization has been
developed which promises to be superior to any so far developed.

The Mechanical Reasoning Test is another form of the familiar
Bennett Mechanical Comprehension Test. The mechanical principles are
illustrated with pictures of familiar objects, but care was taken to avoid
textbook illustrations.

The Clerical Speed and Accuracy Test is designed to measure speed of
response to numerical and alphabetical symbols. Although presumably
a substitute for the Minnesota Clerical Test, it differs considerably from
the latter in its mechanics, and also seems to differ in its factorial
composition, for letter and number-letter combinations are substituted for the
names used in the older test. The examinee finds the underlined
combination in each row of a block of symbols, then marks the same combination
(differently placed) in the same row of the same block on the answer sheet.
Intelligence plays a less important part in this task than in the
Minnesota Names Test.

The Language Usage Test contains two parts, Spelling and Sentences. 
In the former, each word is marked as spelled either right or wrong; in 
the latter each sentence is divided into parts, to be marked according to 
their correctness. The types are familiar, the items chosen by established 
scientific procedures. 

An attempt was made, in drawing and printing the items in these tests,
to make them sufficiently large and clear so that visual acuity would play
no part. Inspection of the items does suggest that they are free from some
of the defects which can be noted in certain other tests involving
mechanical objects, geometric figures, and other drawings in which details might
be obscure or irrelevant differences slight and confusing.

Although the test authors point out that the Differential Aptitude
Tests were designed, not to measure all known and mensurable aptitudes,
but rather to measure a number of important variables which have
meaning for vocational counseling and selection and which can be
assessed in a reasonable period of time, one cannot help but check the
aptitudes tapped by these tests against those assessed by the USES General
Aptitude Test Battery and isolated by various factor analysis studies
( 7 ‘hoASd). The Verbal, Numerical, Spatial, Abstract Reasoning, and
Clerical Speed and Accuracy Tests clearly correspond to the verbal,
numerical, spatial, reasoning, and perceptual factors isolated by Thurstone
and by Shartle and associates, and measured by the General Aptitude Test
Battery. The Mechanical Reasoning Test has no counterpart in the GATB,
presumably because it taps a composite of factors rather than one factor;
neither do the Language Usage Tests, which are achievement measures.
On the other hand, Thurstone isolated a memory factor (not reliably
measured), and the GATB provides measures of eye-hand co-ordination,
motor speed, finger dexterity, and manual dexterity (the last two require
apparatus tests), and distinguishes between form and clerical perception.

This suggests that Bennett, Seashore, and Wesman have gone further
in the direction of measuring what counselors look for and what has
proved to have validity than did Shartle, Dvorak, and associates; and,
conversely, that the latter have attempted more consistently to make
useful the findings of the factor theorists. Given adequate norms and
validation data, the USES policy may prove wiser in the long run; until then,
the Psychological Corporation’s policy of providing measures of types
which have known occupational validities may be sounder.

Administration and Scoring. The eight tests are printed in seven
booklets (the two Language Usage Tests are in one booklet), making possible
administration of any of the tests in any order desired. Time limits are
such that any test can be given in one class period; they vary from six
minutes to 35 minutes. Total testing time is three hours and six minutes.
The manual recommends that the tests be given in an order which will
hold interest and avoid monotony, and suggests two arrangements, one
of three and one of two testing sessions, which are not quite identical.
This raises the interesting question of the possible effect on profile scores
of testing some students with one sequence, some with another, and of
testing some students in few sessions, some in several. The test authors do
not mention this problem, which may not be an important one, but until
it is demonstrated to have no effect it is probably wise to adopt a set
sequence and spacing of tests and to follow it rigidly, thereby making
all local scores comparable with each other if not actually with the
national norms (it is not clear just what sequence and spacing were used
in gathering norms, nor even that the procedure was standardized in
this respect). Answers are recorded on IBM answer sheets, making
possible either hand-stencil or machine scoring. The manual contains
unusually complete suggestions for efficient test administration and scoring,
from advance arrangements to a summary table of scoring information,
incorporating the best experience of the large-scale testing programs of
recent years.

Norms. Norms are available for each grade from 8th through 12th,
and for each sex, for both forms of the test. They permit the conversion
of raw scores into percentiles, which were adopted instead of standard
scores because of their current widespread use; the profiles permit
conversion into approximate standard scores, and such a system is to be
made available in due course because of that system’s more accurate
representation of individual differences. The students on whom the tests were
standardized were enrolled in schools scattered throughout the Eastern
and Midwestern states; Western and Southern norms are in preparation;
industrial and business norms will be provided routinely to manual
owners as research projects make them available. The Eastern and Midwestern
norms are based on 30 school systems, ranging from Yorktown Heights (a
small northern Westchester County suburb of New York City) and
Gloucester (a small Massachusetts fishing and resort city) to Ann Arbor (the
University of Michigan’s college town) and St. Paul (Minnesota’s
industrial city). In some communities all pupils in all five grades were tested; in
others, representative samples (as judged by the local research director)
were tested. Form A was standardized on the largest groups; these range
in numbers from 382 for the 12th grade boys to rybi for the 9th grade
boys, and from 578 12th grade girls to 1 (>^2 9th grade girls. For a
preliminary standardization such regional coverage and numbers are almost
unique; they appear to be such as to make possible the use of the tests at
once in Eastern and Midwestern communities. Judging by the results of
other tests these norms will be somewhat high for Southern states, and
somewhat low for the West Coast, but other regional norms to be
published will soon, no doubt, be available. It is to be hoped that curricular
norms will become available, and that college freshmen norms, based
on homogeneous and well-described types of colleges, will also be
compiled and published.
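
The conversion of a raw score into a percentile against a norm group can be sketched as follows. This is only an illustrative sketch: the function name, the mid-rank convention for tied scores, and the small norm group are assumptions made here, not the procedure or data of the manual.

```python
def percentile_rank(raw_score, norm_scores):
    """Percentile rank of raw_score within a norm group.

    Assumed convention (not taken from the manual): scores tied with
    raw_score count as half below, half above.
    """
    below = sum(1 for s in norm_scores if s < raw_score)
    tied = sum(1 for s in norm_scores if s == raw_score)
    return 100.0 * (below + 0.5 * tied) / len(norm_scores)


# Hypothetical norm group of five raw scores, for illustration only.
hypothetical_norms = [1, 2, 3, 4, 5]
middle_rank = percentile_rank(3, hypothetical_norms)
```

Because the same raw score falls at different percentiles in different norm groups, a score should be read only against the grade and sex group to which the examinee belongs.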

Standardization and Initial Validation. What has previously been
said concerning the content, the development of norms for this battery
of tests, and data in the subsequent paragraphs on its reliability, conveys
an adequate idea of the work which was done in standardizing these tests.
The types of items to be included were decided upon on the basis of
factor analysis and validation studies carried out by other psychologists with
other tests. The test items were tried out in preliminary studies and the
tests were administered for standardization purposes only when they
seemed administrable. Care was taken to obtain large samples of students
at each appropriate grade level and in representative communities. The
reliability of each test was computed and, with one limited exception,
found adequate for individual diagnosis (see below). Finally, the
intercorrelations of the tests were obtained. These latter ranged, for Battery A
(boys), from .06 (Mechanical Reasoning and Clerical Speed and Accuracy)
to .62 (Verbal Reasoning and Language Usage: Sentences); data for girls,
and for Battery B, were approximately the same. The median
intercorrelation for Form A tests is .925. These intercorrelations are not much
higher than those of the Primary Mental Abilities Tests, after allowance
is made for the achievement (Language Usage) and composite
(Mechanical Reasoning) tests, for which the two highest correlations with other
tests were obtained. Knowledge of the educational and vocational
predictive value of similar tests which have already been discussed (the
Primary Mental Abilities Tests, A.C.E. Psychological Examination,
Minnesota Paper Form Board, Bennett Mechanical Comprehension Test,
Minnesota Clerical Test), combined with the proved reliability and
relative independence of these tests, suggests that studies using external
criteria should demonstrate considerable validity for these tests.

Reliability. Particular care was taken, in establishing the reliability of
the Differential Aptitude Tests, to avoid the common defect of tests
with part scores, that is, reliability of the total score but insufficient
reliability of the part scores for individual diagnosis. Homogeneous groups
were used, to avoid the spuriously high coefficients which are yielded by
heterogeneous groups. Split-half reliabilities were computed for all but
the Clerical Speed and Accuracy Test, for which, as a speed test, that
technique is not suited; instead, alternate-form reliability was ascertained.
The Form A reliability coefficients for cjfio boys range from .Sr,
(Mechanical Reasoning) to .qg (Space Relations); for tobj girls they ranged
from .71 (Mechanical Reasoning, a type of test which generally has little
value for girls) or .Sf> (Numerical Ability, the second lowest for girls) to
.()2 (Language Usage: Spelling). For boys, then, all of the tests in
Battery A have quite adequate reliability; for girls, all those which are likely
to be useful have equal reliability. Data for Battery B are about the same,
this form of the Mechanical Reasoning Test having been revised and
improved.
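
The split-half technique mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the odd-even split and the Spearman-Brown step-up are the conventional choices, but the text does not say which split or correction the test authors used, and the function names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores (e.g. 0/1) per examinee.

    Assumption: halves are formed from odd- and even-numbered items.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson(odd_half, even_half))
```

The sketch also shows why the technique is unsuited to a speed test: when items are easy and the score depends on how far the examinee gets, the odd and even halves are almost necessarily equal, and the coefficient is spuriously high.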

Validity. The Differential Aptitude Tests being recently published,
there has been little time for the carrying out of studies of their validity
in relation to external criteria. The authors felt that the known
significance of the abilities measured, combined with the internal evidence of
validity, was sufficient to justify making the test available at this stage of
development (Manual: E-a). As they have committed themselves to an
extensive program for the validation of the battery against educational
and occupational criteria (Manual: E-t), and as the early publication of
a reliable test has been demonstrated to speed up its further validation
by other investigators (e.g., Kuder’s work, Ch. 18), this would seem to be
quite justifiable. A supplement to the manual now includes a large
number of validity coefficients, based on the high school grades of norm
groups.

Use of the Differential Aptitude Tests in Counseling and Selection.

The preliminary evidence concerning the development and
standardization of the DAT battery suggests that these tests measure a number of
variables which have frequently been found to have vocational
significance. For an understanding of the development and vocational
significance of the traits measured by these tests, the chapters dealing with
similar specific tests should be studied.

In schools and colleges, when clinical counseling is to be done, that is,
when the objective is the study of a counselee in terms of his psychological
make-up and its general educational and vocational implications, the
battery should prove useful. When, however, comparisons need to be
made with pre-occupational or occupational groups, the lack of
occupationally differential norms renders this battery temporarily useless. As
there is every reason for believing that curricular and occupational norms
will be developed, counselors in schools and colleges may want to use
this battery for clinical counseling, developing their own curricular and
vocational norms as part of their follow-up work.

Guidance and employment centers which habitually carry on
normative studies may also find it worth their while to use this battery of tests
in clinical counseling, supplementing it with others which have
occupational norms when such data are really needed. Other tests, such as those
of manual dexterity, may be needed in any case to round out the picture,
together with personal data obtained in interviews. If the battery is used,
it should be only with a definite co-ordinate research program in mind.
This can be materially aided when the center works co-operatively with
business and industry in employee selection programs.

In business and industry, even more than in guidance work involving
clinical counseling, the gathering of local norms and validation against
local criteria should precede the use of the results of these tests for
selection purposes. Validation for selection is so much easier than validation
for counseling, and the accuracy of predictions is improved by so much
greater a degree, that to adopt any other policy is to be guilty of gross
negligence.



CHAPTER XVI 


THE NATURE OF INTERESTS 


INTERESTS have probably received more attention from vocational
psychologists during the past generation than any other single type of
human characteristic, including intelligence, aptitudes, and personality
traits. In contrast with no books and only a few monographs (rjip,,823,887)
published in America on intelligence and vocational adjustment, and
two text-books (385.94) and four significant monographs (588,r,8c).1.^01,
5TT>) on aptitudes and vocational success, there have been two scholarly
books (21:7,775), at least four significant monographs (27c), 1 jr )t 189.7(13)
and a number of important reviews of research published in the journals
(77,11 1,798.(Son), all dealing with the nature and role of interests.

Psychologists who have had other specialties have paid much more
attention to other types of characteristics, Allport (NcjS) and Thorndike
(830) being among the few to study interests, through the Allport-Vernon
Study of Values (see below) and various introspective techniques.
Clinical psychologists have tended to devote their energies more to the
measurement of intelligence and to the diagnosis of mental defect and
malfunctioning; students of individual differences have focused on
abilities; and personologists have been challenged more by problems of
the organization of personality (12,7 ]3.554) and by needs and drives
(557). The genetic psychologists are perhaps an exception, as they have
paid some attention to the development of play interests, in a type of
study illustrated by those of Lehman and Witty (qfn \h 1).

It is worthy of note that, when these differing approaches to the
psychology of individual differences have briefly met, the result has
more often than not been confusion. Thus Lehman and Witty loosed
a broadside at vocational interest inventories, decrying their use in
counseling on the grounds that interests are unreliable (jf>3), but their
evidence to that effect was based on expressions rather than on
inventories of interests. This is an important distinction which will shortly
be made clear. Because of the accumulation of unsynthesized material
on the nature and development of interests, these topics are dealt with
at length in this section. The role of interest in vocational adjustment
will be considered later, in connection with the validity of specific
instruments.

Definitions. There have been four major interpretations of the term
interest, connected with as many different methods of obtaining data.
In an attempt to clarify thinking in this area the writer has (800)
classified them as expressions, manifestations, tests, and inventories of interests.
Each of these is taken up in turn, to provide a framework for the
subsequent discussion of the measurement of interest.

Expressed interest is the verbal profession of interest in an object,
activity, task, or occupation; Fryer (277) called it specific interest. The
client simply states that he likes, is indifferent to, or dislikes the activity
in question. There has been relatively little research in this area since
Fryer’s (277) detailed review in 1931, as shown in subsequent reviews by
Carter (1 j5) and by Berdie (79). The conclusion to be drawn from the
later reviews is the same as that drawn by Fryer: the expressed or
“specific” interests of children and adolescents are unstable, and do not
provide useful data for diagnosis or prognosis. For adults, however, the
picture is somewhat more optimistic, for Strong (7751(157) has shown that
the constancy of responses to the 400 items in his inventory ranges from
52.0 percent for high school juniors after six years (reflecting the
instability of expressions of interest just referred to) to 82.8 percent for
women physicians after one day, showing that even specific or expressed
interests are rather stable in adults over a short period. The importance
which may be attached to expressions of specific interests clearly varies
with the maturity of the client. As Gilger (290), Lurie (jpo) and Trow
(87r } ) have shown, it also depends upon the ways in which the questions
are phrased, for some questions concerning vocational interest are so put
as to elicit information concerning vocational choice, some to ascertain
vocational preferences, and some to evoke vocational fantasies. The
degree of realism represented by the expression of interest varies with the
type of question asked. Studies of the relationship of expressed
preferences to scores on Strong’s Inventory, discussed below, illustrate this fact.

Manifest interest is synonymous with participation in an activity or
an occupation. Objective manifestations of interest have been studied in
order to avoid the subjectivity of expressions or to avoid the implication
that interest is something static. Thus Kitson (128: Ch. 8) has urged
that the verb “to be interested” should be used, indicating that a process
and activity are involved. In this approach it is assumed that the high
school youth who was active in the dramatic club has artistic or literary
interests, and that the accountant who devotes two evenings per week
to building and operating a model railroad system is interested in
mechanics or engineering. It is generally appreciated that such manifest
interests are sometimes the result of interest in the concomitants or
by-products of the activity rather than in the activity itself. The high
school actor may have merely been seeking association with others, which
he may later need less or obtain by different means. In other cases the
opportunities for the manifestation of an interest may be limited by the
environment or by financial considerations, so that an expressed interest
has no manifest counterpart. For these reasons, manifest interest has
not been used as a predictor of interest in many studies, although it has
often served as a criterion, the reasoning being that anything as dynamic
as interest should in most cases find an outlet.

Tested interest is here used to refer to interest as measured by objective
tests, as differentiated from inventories which are based on subjective
self-estimates. It is assumed that, since interest in a vocation is likely to
manifest itself in action, it should also result in an accumulation of
relevant information. Thus interest in science should cause a person to
read about scientific developments, whether in a science course or in the
daily paper, and to acquire and retain more information about science
than would other people. Fryer (277did j ff.) has reviewed the attempts
which were made by O’Rourke, Toops, Burtt, McHale and others during
and after World War I to measure interest by means of the amount and
type of information retained, and has pointed out that these were not
followed up because of the cumbersomeness of memory and information
tests.

With the improvement of testing and statistical techniques which
subsequently took place, however, interest in the development of interest
tests revived. At this time Greene published his Michigan Vocabulary
Profile Test (308), measuring interest through specialized vocabularies.
The Co-operative Test Service brought out a general information test
which Flanagan (262) described as a measure of interest in several areas.
The writer and his students at Clark University (80(1,805.574) began a
series of investigations designed to develop an attention or recent-memory
test of interest in vocational activities. During World War II the Aviation
Psychology Program brought together several psychologists who had been
working along these lines (R. N. Hobbs, R. R. Blake, D. E. Super, J. C.
Flanagan, and F. B. Davis). Their efforts resulted in the development of
a General Information Test which gave differential scores for pilots,
navigators, and bombardiers and which proved to be the most valid
single test in the Air Force’s selection and classification battery (aG 1,214).
The writer constructed a similar test for the American Institute for
Research which has been used in the selection of pilots for commercial
airlines. Other civilian applications are also being made, and the
technique will in time probably prove to be generally useful for selection
and counseling.

Inventoried Interest is assessed by means of lists of activities and
occupations which bear a superficial resemblance to some questionnaires
for the study of expressed interests, for each item in the list is responded
to with an expression of preference. The essential and all-important
difference is that in the case of the inventory each possible response is
given an experimentally determined weight, and the weights
corresponding to the answers given by the person completing the inventory are
added in order to yield a score which represents, not a single subjective
estimate as in the case of expressed interests, but a pattern of interests
which research has shown to be rather stable. The apparently logical
objection that no statistical combination of unstable elements can yield
a stable total is met by Strong’s study (775:871) of the effect of changes
of responses to specific items on inventory scores: although changes of
expressions of liking or disliking of as many as 125 of his 400 items were
found, these shifts had no appreciable effect on scores for occupational
interests. The reason for this is that shifts in one direction are balanced
by shifts in the other direction, the underlying pattern or trend of
interest being constant. Strong’s work provided a foundation for a great many
studies in the psychology and measurement of interest, and made
possible the development of practical instruments for use in counseling and
selection. He has summarized most of the significant research with his
Vocational Interest Blank in a volume (775) which is one of the classics
in the field of measurement. Other inventories have been developed by
Kuder (j pi), Garretson and Symonds (279), Dunlap (219,713), and others;
some of these are discussed later in this chapter.
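
The scoring principle just described, in which the weights attached to a person's responses are summed to give a scale score, can be illustrated with a short sketch. The item names, the response codes (L, I, D for like, indifferent, dislike), and the weights are all hypothetical and chosen only to show the arithmetic; they are not Strong's items or weights.

```python
def scale_score(responses, weights):
    """Sum the weight attached to each response actually given.

    responses: item -> response code ("L", "I", or "D")
    weights:   item -> {response code -> weight} for one occupational scale
    """
    return sum(weights[item][answer] for item, answer in responses.items())


# Hypothetical weights for a three-item occupational scale.
hypothetical_weights = {
    "repairing a clock": {"L": 2, "I": 0, "D": -2},
    "operating machinery": {"L": 2, "I": 0, "D": -2},
    "writing poetry": {"L": -1, "I": 0, "D": 1},
}

# The same person on two occasions: two of three responses have shifted.
first_testing = {
    "repairing a clock": "L",
    "operating machinery": "I",
    "writing poetry": "D",
}
second_testing = {
    "repairing a clock": "I",
    "operating machinery": "L",
    "writing poetry": "D",
}

first_score = scale_score(first_testing, hypothetical_weights)
second_score = scale_score(second_testing, hypothetical_weights)
```

In this contrived case the loss on one item is exactly offset by the gain on another, so the two scores are equal although two of the three responses changed. With several hundred items the balancing is of course only approximate, which is the point of the finding reported above.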

The term interest is also used to convey other concepts, the most
relevant of which are degree of interest or strength of motivation and
drive or need. The former needs no discussion, as it is a matter of degree
rather than of kind: when it is said that someone is vitally interested
in attaining a goal, the statement is one concerning the degree of some
underlying (inventoried) interest or the strength of some drive. The
concept of interest as drive does require discussion, for when it is said
that an individual is interested in winning friends or in gaining prestige,
the type of interest referred to is not covered by any of the concepts so
far discussed. Interests or drives of this type are of a different and more
fundamental order than either specific or underlying interests; they
constitute a deeper layer of personality. Unlike interests, which are
sometimes included under the heading personality and sometimes not, drives
or needs are generally considered to be one of the central aspects of
personality. They are therefore discussed, together with their vocational
significance and methods of measuring them, in the next chapter.

Types of Interests. As in the case of intelligence testing, progress in
the measurement of interests was first made possible by a shotgun
approach which was concerned less with the specific nature of that which
was being measured than with the fact that it could be measured. The
all-important discovery made by Strong and his students was that the
interests of men in a given occupation, e.g., engineering, were different
from those of men-in-general (775: Ch. 7; 171). It was only after scales
had been developed for the measurement of the interests of men in a
number of occupations that factor analysis (8;;(>) and item analysis (.] jfi)
revealed the nature of these interests. For this reason the logical sequence
of topics which follows is not the historical order in which discoveries
were made.

Interest factors were first studied by Thurstone (HqG), who applied
factor analysis to 18 occupational scales of the Strong Vocational Interest
Blank. Strong (775: Ch. 8 and i.j) later made several factor analyses, in
the last of which he used data from 36 occupational scales, first without
rotating the axes (like Thurstone) and then by rotating them. For
clarity’s sake, the results of these three analyses are presented in Table
27, together with data from three other studies and a logical synthesis
of the findings of all six studies.

Allport and Vernon developed their Study of Values as a measure of
the values postulated by Spranger. Lurie (189) also devised an
instrument for appraising these values and, unlike Allport and Vernon,
subjected it to factor analysis. These two lists of factors are also presented
in Table 27.

Further evidence concerning the nature of interest factors is provided
by Kuder’s work with his Preference Record (described below). This
inventory gives scores for nine types of interests,1 which cannot be called
factors in the statistical sense of the term as they were not isolated by
factor analysis methods, but which amount to about the same thing as
they are based on item analysis and are therefore internally consistent
and mutually independent. Kuder originally developed seven scales by
this method; these are listed in Table 27. He later added two more,
which are listed in parentheses because, unlike the others, they had
substantial correlations with other keys: mechanical interests correlated
. jor, with scientific, and clerical .70 with computational.

1 A tenth, “Outdoor” interest, has been added.

Table 27

INTEREST FACTORS REVEALED BY SIX STUDIES AND LOGICAL SYNTHESIS

Thurstone  Allport-Vernon  Lurie          Strong             Strong             Kuder            Synthesis
                                          (Unrotated)        (Rotated)

Science    Theoretical     Theoretical    Science            Science            Scientific       Scientific
People     Social          Social         People             People             Social-Service   Social-Welfare
Language   -               -              Language           Language           Literary         Literary
-          -               -              Things vs. People  Things vs. People  (Mechanical)     Material
Business   Economic,       Materialistic  Business           System             (Clerical),      System
           Political                                                            Computational
                                                             Contact            Persuasive       Contact
-          Aesthetic       -              -                  -                  Artistic         Artistic
-          Religious       Religious      -                  -                  -                -
-          -               -              -                  -                  Musical          Musical

Finally, after a study of the factors appearing in columns one through
six of Table 27, together with the literature upon which they are based,
the writer has developed the list of factors appearing in the last column
of this table headed “Synthesis.” The naming of statistically isolated
factors is a highly subjective and arbitrary process. For example, three
authorities have variously named the same factor “interest in male
association,” “interest in order or systematic work,” and
“non-professional interests” (777,: d> j—i(»<>). In one sense, therefore, the writer is
justified in attempting a synthesis of the findings of various investigators
and in applying his own names to the various categories; in another sense,
the whole process of naming interest factors is open to criticism as a
potentially misleading one. It can be justified, perhaps, on the grounds
that a cautiously named concept, cautiously used, is better than no
concept at all: it merely behooves the name-giver to point out the need
for caution.

Table 27 brings out complete agreement on the first interest factor,
the scientific, which may be defined as an interest in knowing the why
and how of things, particularly in the realm of natural science (only the
Allport-Vernon attempts to assess interest in scientia in the philosophical
sense). There is agreement also on the second factor, interest in social
welfare or in people for their own sake. The third factor is not provided
for by Allport and Vernon or by Lurie, who were limited by Spranger’s
postulates; but as factor analysis, like qualitative analysis in chemistry,
can isolate only the elements which were originally put into the
compound, the lack of positive findings in these studies can be disregarded.
The Thurstone-Strong-Kuder data can be accepted as evidence of the
existence of a literary interest factor, consisting of interest in the use of
words and in the manipulation of verbal concepts. A fourth factor, again
not revealed by the Spranger-inspired studies, by the Thurstone analysis
(which was presumably based on too few occupations), nor by the Kuder
procedure (though perhaps partly covered by his mechanical scale), but
found in Strong’s two analyses, might best be called the material or
concrete, although Strong named it “things vs. people” or, on the basis
of negative loadings in the literary and linguistic occupations,
“language.” The writer prefers Kitson’s term “material” because the
occupations in which it has heavy positive loadings tend to involve working
with tangibles. Carpenter, mathematics-science teacher, farmer, printer,
production manager, engineer, chemist (these last two have heavier
scientific loadings), and even policeman and accountant may be included
in this category, since they are concerned, respectively, with the
protection and the management of property. The fifth factor, one concerning
which there is considerable agreement, is the systematic or perhaps
record-keeping; it emerged most clearly in Strong’s more refined analysis
but he refused to name it, although he states that it might be called the
C.P.A. factor. Kuder’s computational interest appears to be similar, and
it is probably covered also by Thurstone’s business, Allport and Vernon’s
economic, and Lurie’s misnamed philistine (or materialistic) values.
The sixth, or contact factor, is also probably included in the too
comprehensive complex of factors called business and economic in the
Thurstone and Spranger categorizations, refined by Strong’s final
analysis. It is the second factor which Strong thought it wise to refrain from
naming until more occupations were found to have loadings of it. It
seems to involve interest in meeting or dealing with people not for
their own sakes but for material gain. Kuder’s persuasive interest appears
to be identical with it. Finally there are the artistic and musical factors,
the former agreed upon by Allport and Vernon and by Kuder, and
suggested in another study of Thurstone’s (837); the latter isolated by Kuder
only and therefore quite tentative, although the failure of Strong and
Thurstone to find such a factor proves little in view of the presence of
only one musical occupation in their lists.

Occupational differences in patterns of interests were, it has already
been pointed out, the basic discovery which made possible subsequent
studies of the nature and role of vocational interests. Beginning his work
in interest measurement as a member of the outstanding group of
applied psychologists who were assembled at the Carnegie Institute of
Technology after World War I, Strong continued to experiment with
the vocational interest inventory technique after he joined the faculty
at Stanford University, and there succeeded in establishing the fact that
the inventoried interests of men who are engaged in different occupations
differ significantly from those of men-in-general (775: Ch. 7).

Some occupational groups, however, were not distinguishable from
men-in-general; Strong’s early attempts to develop scales for executives
and for teachers failed (775:20,161 ff.), and in his later studies of the
interests of public administrators (779,780) he encountered difficulties
which were essentially similar. The reason for the failure to establish
patterns of interest peculiar to executives and teachers, and for the
lack of validity of the public administrator scale for some groups of
administrators, may lie in the fact that these are not truly
occupational groups. Strong’s work in the development of teachers’
scales has shown, for example, that men social-studies teachers seem to
be primarily social-welfare workers (r with YMCA secretary = .87),
mathematics-science teachers resemble skilled tradesmen (r with
carpenter = .68, with printer = .72), and the correlation between the
interests of men in these two
types of teaching occupations is practically zero (r —.13), as shown in 
Strong’s table of intercorrelations (775: opposite 716). Similarly, the
executive group was made up of men who were essentially engineers,
lawyers, or other specialists (770), and the public administrators also
included many men who were professional men at heart but who had
been given administrative responsibility (779).

The occupations which were differentiable on the basis of interest
patterns of the men engaged in them could be grouped, Strong found
(775: Ch. 8), according to the degree of similarity which existed between
their interests. Some of the occupational interest scales were
positively intercorrelated, others negatively, in varying degrees.
Strong therefore grouped the various occupational scales on the basis of
these intercorrelations, establishing .60 as the minimum
intercorrelation necessary for two occupations to be assigned to the
same family. The resulting families may be characterized as follows:

Biological Science Occupations     e.g. Physician
Physical Science Occupations       e.g. Chemist
Technical Occupations              e.g. Printer
Social Welfare Occupations         e.g. Y Secretary
Business Detail Occupations        e.g. Accountant
Business Contact Occupations       e.g. Life Insurance Salesman
Linguistic Occupations             e.g. Lawyer

The terminology is essentially Darley’s (189), but not that used by
Strong, who has been extremely reluctant to name groups which seem at
all heterogeneous. He characterized the second group as “mathematics
and physical sciences,” the fourth as “handling people for their
presumed good,” the fifth as “office,” the sixth as “sales,” and the
seventh and last as “linguistic” (775:160), but felt that the presence
of such vocational groups as artists and architects in the first or
biological science group (r artist-physician = .79) makes it difficult
to name, and that aviators, carpenters, mathematics-science teachers,
and policemen make odd bedfellows in the so-called technical group, even
though their intercorrelations with the printer scale are .hr,, .73, and
.72 respectively. As Strong points out in his discussion (775:159-160),
the sub-professional technical group appeared originally as part of a
general scientific group which included also the biological and
physical-science occupations, but which broke up into the three
scientific or near-scientific groups of the current classification when
more occupational scales were devised. As additional occupational scales
are developed, it is probable that the so-called technical group will
further subdivide, an hypothesis for which Strong provides some
important substantiating data in the analysis of the effect of the point
of reference (775: Ch. 21 and 22).
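
The family-forming rule described above, a minimum intercorrelation of
.60 between two scales, may be made concrete with a short sketch. It is
offered only as an illustration: the single-linkage rule (any one
correlation of .60 or more joins two scales) is an assumption, not
Strong’s documented procedure, and only correlations quoted in this
section are entered, all other pairs being treated as unknown.

```python
# Illustrative sketch only: single-linkage grouping is an assumption,
# not Strong's documented procedure.
THRESHOLD = 0.60  # minimum intercorrelation for membership in one family

# Correlations quoted in this section; unlisted pairs are treated as unknown.
CORRELATIONS = {
    ("artist", "physician"): 0.79,
    ("social-studies teacher", "YMCA secretary"): 0.87,
    ("mathematics-science teacher", "carpenter"): 0.68,
    ("mathematics-science teacher", "printer"): 0.72,
    ("public utility salesman", "life insurance salesman"): 0.39,
}


def group_into_families(correlations, threshold=THRESHOLD):
    """Link any two scales whose correlation meets the threshold."""
    parent = {}

    def find(name):
        # Return the representative scale of the family containing `name`.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for (a, b), r in correlations.items():
        root_a, root_b = find(a), find(b)  # registers both scales
        if r >= threshold and root_a != root_b:
            parent[root_a] = root_b

    families = {}
    for name in parent:
        families.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in families.values())


families = group_into_families(CORRELATIONS)
for members in families:
    print(members)
```

On these few figures the carpenter, printer, and mathematics-science
teacher scales fall into one family, while the two salesman scales, whose
correlation is only .39, remain apart.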

Another group which seems likely to subdivide as more keys are added
is the contact or sales group. The danger involved in either of these
names is brought out by the fact that public utility salesmen belong
more in the business detail group (r office worker = .69) than in the
contact family (r life insurance salesman = .39). Strong therefore feels
that there will in time be a new sales group, consisting of
house-to-house salesmen. The classification of occupations on the basis
of interests must therefore be considered tentative, and one must not
let the very natural desire to give names to categories lead to the
making of false generalizations.

The data on differences between kinds of teachers and salesmen raise
a question concerning other occupations. They prompt one to ask
whether a sufficiently refined analysis would reveal similar differences
between mechanical, electrical, civil, and chemical engineers, for
example, or between various types of secretaries in the YMCA. Using his
standard techniques of comparing one occupational group with
men-in-general (described in the section dealing with the inventory),
Strong found no differences between the interests of the various types
of engineers, the correlation between civil and electrical engineer, for
example, being .86 (775:1 iH). He obtained similar results with scales
for the interests of YMCA general and boys’ work secretaries, only the
physical directors being distinct enough (r combined Y secretary scale =
.74) to warrant a separate key. These results would seem to point to the
conclusion that some occupations can be broken down into specialized
subgroups on the basis of interests, and that others cannot. To the
first category one might add teachers and public administrators, already
discussed in another connection, and certain types of sales work; to the
latter, sales managers and salesmen in certain fields such as life
insurance and vacuum-cleaners. It would be interesting to know the facts
for criminal and corporation lawyers; surgeons, pediatricians, and
psychiatrists [surgeons do not differ from physicians (775:697)]; and
clinical, industrial, differential and physiological psychologists. A
study carried out under Paterson’s supervision at the University of
Minnesota deals with this last occupation. In view of the evidence
accumulated by Strong it seems safe, in the meantime, to state that when
compared with the interests of men-in-general, the interests of men in a
broadly defined occupation are so similar as to obscure differences
between specialties. A mechanical engineer, when compared to
non-engineers, is more like unto than different from a civil or
electrical engineer, for then the common factor, engineering, is crucial.

It has been shown, however, that when a different point of reference
is used, men in specialties within an occupation can be differentiated.
Using engineering students as subjects, Estes and Horn (291) compared
the interests of each type of engineer, mechanical for example, with
those of all other types of engineers studied. The point of reference in
this study was therefore not men-in-general, but engineers-in-general.
Under these conditions the differences between the interests of the
various specialty groups became visible, and separate scales could be
developed. Strong recognized the possibilities of this approach
(775:120), but has not attempted to capitalize on them, nor has anyone
else. The method is one which might well commend itself to professional
schools interested in providing better guidance services for their
students or in improving their selection procedures.

The point of reference used in constructing occupational scoring keys
for interest inventories has been found to have one other important
effect on our knowledge of occupational differences in interests. The
first men-in-general group used by Strong consisted, for reasons of
convenience, of men for whom test data were on hand and who were not in
the occupation under investigation (775:555). This happened to be an
economically somewhat select group, for the first scales constructed
were for occupations which were of either professional or managerial
calibre. When Strong’s Vocational Interest Blank was used with men from
the lower half of the occupational hierarchy, little differentiation of
interests was found: printers, carpenters, policemen, and farmers,
coming from the skilled trades level, had so much in common, when
compared with a professional-managerial-clerical men-in-general group,
that the differences between them were not very significant (the
intercorrelations approximate .70), and persons habitually employed at
the semiskilled levels seemed to be undifferentiated on the basis of
their interests (8j).

These findings led the staff of the Minnesota Employment Stabilization
Research Institute to hypothesize concerning the differentiation of
semiskilled and unskilled workers on the basis of interests (8j), and
prompted Strong to pursue a line of research already suggested by his
work with the women’s blank (775:55.}). He therefore developed
occupational scales based on three different points of reference. These
consisted of (1) business and professional men earning $2500 or more per
annum (rather like the original scales), (2) a proportional sample of
all occupational levels averaging, like the general population, at the
skilled trade level, and (3) a proportional sample of skilled,
semiskilled, and unskilled workers, averaging at the semiskilled level.
For convenience these three reference points, called P1, P2, and P3 by
Strong, may be referred to as the white-collar, general, and blue-denim
groups. Three hypotheses were set up to be tested by means of these
occupationally similar, but referentially different, keys. These were

1. Certain occupations at different levels have the same types of
interests (e.g., engineers at the professional and mechanics at the
skilled);

2. The rank and file cannot be differentiated by their interests (e.g.,
semiskilled workers tend to make no high scores);

3. Men in the lower-level occupations have their own
occupationally-specialized interests (e.g., when compared to other
semiskilled workers, drill-press operators have interests which are
different from those of electrical-unit assembly workers).

Using scales based on the general point of reference (P2), Strong found
that the correlation between the printer and carpenter scales was .2),
whereas it was when the white-collar point of reference (P1) was used.
Similarly, the correlation between printer and policeman was -.27
instead of .59. In other words, when an appropriate point of reference
is used, differences between the interests of men in a given occupation
and those of men in the reference group appear significant; when an
inappropriate reference point is used, differences between the interests
of men in the occupation being studied and those of “men-in-general”
are obscured. This holds whether it is men in a low-level occupation
being studied against a high-level men-in-general group, or men in a
high-level occupation being studied with a low-level men-in-general
group as point of reference. Strong’s third hypothesis was therefore
confirmed, and it is to be expected that in due course occupational
interest scales will be developed which will be useful with men of less
than average socio-economic level and for counseling and selection for
more occupations at the skilled and semiskilled levels, many of which
should be found to have differentiating patterns of interests.
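
The effect of the point of reference on the correlation between two
occupational keys can be illustrated with a small numerical sketch. It
rests on stated assumptions: every proportion below is hypothetical, the
six “items” are invented, and a key is taken to be simply the difference
between the occupation’s and the reference group’s proportion marking
“like” on each item, which simplifies whatever weighting Strong actually
used.

```python
from math import sqrt

# Hypothetical proportions of each group marking "like" on six items.
PRINTER = [0.80, 0.70, 0.30, 0.20, 0.60, 0.40]
CARPENTER = [0.70, 0.80, 0.20, 0.30, 0.40, 0.60]
WHITE_COLLAR = [0.30, 0.30, 0.70, 0.70, 0.50, 0.50]  # stands in for P1
GENERAL = [0.70, 0.70, 0.30, 0.30, 0.50, 0.50]       # stands in for P2


def scoring_key(occupation, reference):
    """Item weights: occupation proportion minus reference proportion."""
    return [o - r for o, r in zip(occupation, reference)]


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def key_correlation(occupation_a, occupation_b, reference):
    """Correlation between two occupational keys built on one reference."""
    return pearson(scoring_key(occupation_a, reference),
                   scoring_key(occupation_b, reference))


r_white_collar = key_correlation(PRINTER, CARPENTER, WHITE_COLLAR)
r_general = key_correlation(PRINTER, CARPENTER, GENERAL)
print(round(r_white_collar, 2), round(r_general, 2))
```

With these invented figures the two keys correlate about .93 when built
against the distant white-collar group, because both are dominated by
what skilled tradesmen share; against the nearer general group the
shared component drops out and the correlation falls to about -.50.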

Lest it appear from the preceding paragraphs that all of our knowledge
concerning the differentiation of occupational groups on the basis of
interests is based on work with Strong’s inventory, it should perhaps
be mentioned that Kuder (j {(>) and Triggs (in an unpublished paper)
have confirmed Strong’s general findings for some forty occupations with
Kuder’s Preference Record. Triggs went so far as to establish
differential interest patterns for various types of nurses, including
supervisors and public-health nurses. The Allport-Vernon Study of Values
has shown similar trends with pre-occupational groups in colleges (216),
but has not been much used with men and women actually engaged in
occupations. Reference has been largely to work with Strong’s Blank
simply because, as an older and more thoroughly studied vocational
interest inventory, it provides more data from which to draw conclusions.

Socio-Economic Differences. The preceding discussion of the effect
of the point of reference on the identifiability of patterns of
interests was virtually a treatment of research which has been carried
out on socio-economic differences in occupational interests, at least in
so far as methodology is concerned. There still remains the task,
however, of describing the differences in interests which characterize
the various occupational levels. The relevant work is reported in
Strong’s book (77 r >*- to), in connection with his scale for measuring
occupational interest level, or the socio-economic level at which an
individual would be placed on the basis of similarity of interests. Men
who are successfully employed in the higher level occupations tend to
have more interest in literary and legal activities and in business
contact work, and less social welfare and sub-professional technical
interest, than men in lower level occupations. Men in legal and literary
occupations, salesmen, and scientists tend to make high occupational
level scores, although there is no relationship between the scientific
and occupational level scales on Strong’s Blank. Senior public
administrators score higher on occupational level than do junior public
administrators (779). Strong suggests that the scale measures managerial
ability.

On the other hand, it has been more plausibly suggested by Darley
(189:(k) and (>(>) that occupational interest level is indicative of
aspiration level, that it “represents the degree to which the
individual’s total background has prepared him to seek the prestige and
discharge the social responsibilities growing out of high income,
professional status, and recognition or leadership in the community, at
the lower end of the scale, the individual’s background has prepared him
for the anonymity, the mundane round of activities and the followership
status of a great majority of the population.” He suggests, also, that
those who are characterized by a low level of occupational interest are
likely to lack the motivation which results in staying power in college.
Kendall (jul*) has attempted to validate this hypothesis with three
groups of 100 men each at Syracuse University, selected from the
entering freshman class on the basis of high, average, and low
occupational level scores on Strong’s Blank. These three differing
occupational level groups were found to differ also in mental ability as
measured by the Ohio State Psychological Examination. Those who were
high on these measures made higher honor-point ratios during the first
semester. When intelligence was held constant the academic achievement
of the three occupational level groups was again found to differ, the
differences being significant at between the one percent and the 5
percent levels. The differences are therefore not completely clear cut,
but they do suggest that those with extremely low occupational interest
levels are likely to find college work foreign to their taste, whereas
it will be congenial to those who are characterized by high occupational
interest levels.

Avocational Differences. What has been found true of occupations
has been found to apply also to avocations. In a study of model
engineers, amateur photographers, amateur musicians, and stamp
collectors, the writer (791) found that men who were active in the first
three avocations had patterns of interests which differentiated them
from each other, and that the first two interest patterns resembled each
other (r = .58) whereas the first and third had nothing in common
(r = .02). Although the number of avocations studied is small, this
suggests that they are differentiated and may be classified in ways
similar to occupations. The interest patterns of stamp collectors were
found to be similar to those of other groups of men, suggesting that
philately is an avocation which, like the vocation of executive, cuts
across basic interest patterns which are much more important than the
interest common to men engaging in it. It is noteworthy, also, that the
three differentiable avocational interest patterns resemble those of the
expected occupations, e.g., the model engineers have interests like
those of professional engineers, whereas the interests of stamp
collectors are difficult to classify vocationally.

Sex Differences. Popular stereotypes as to the masculinity and
femininity of interests are widespread, and it is natural to ask what
research in the psychology of interests has found in this area. Studies
made by Terman and Miles (820), Carter and Strong (i.jS), Yum (992),
Strong (775: Ch. 11), Kuder (|jh:2g), and Traxler and McCall (8(>S) all
agree that men tend to be more interested in physical activity,
mechanical and scientific matters, politics, and selling. Interest in
art, music, literature, people, clerical work, teaching, and social work
is more characteristic of women. It is especially worthy of note that
masculinity and femininity are scaled traits rather than dichotomies:
people are not masculine or feminine in their interests, but more or
less masculine or feminine. Some men are very masculine, and so are
some, but fewer, women; some women are very feminine, and so are some,
but fewer, men. It is interesting to speculate as to whether the higher
incidence of cultural (artistic, literary, musical, and social)
interests in women means that they are constitutionally the carriers of
culture, or whether they have simply taken on that role because nature
forced men, as the stouter animals, to take on the competitive,
constructional, and provisioning roles. Anthropological studies suggest
the latter, since there are a few societies in which men are the
domestics and women the providers. But physical constitution seems to
play a part, as shown by the preponderance of active-male societies. A
good illustration is Miles’ (529) case study of a boy raised for 17
years as a girl: despite the seemingly overwhelming feminine influences
to which he was subjected, he made definitely masculine scores on the
Terman-Miles Masculinity-Femininity Test and on Strong’s Vocational
Interest Blank (scored for masculinity-femininity of interests).

Age Differences. Counselors and psychologists who have not carefully
studied the literature on change of interest with age frequently
question the wisdom of giving much weight to measures of interests
because of the possibility of change of interest with age. This question
overlaps to some extent with that of the permanence of interests,
discussed below, but it is distinct in that it focuses on the
relationship between age and change, rather than on the effects of
experience.

Three important studies have been made of the differences in interests
which are associated with differences in age. The first of these was by
Strong (771), incorporated and brought up to date in his later book
(775: Ch. 12-r;); the second was a series of follow-up studies by Strong
( 77 r> :i: the third was part of the Adolescent Growth Study of the
University of California, written up in a series of articles by Carter
and others and summarized in his monograph (145) and in a journal article*

Strong’s first approach consisted of comparing the interests of men
at ages 15, 25, 35, and 55, both by analysis of individual items (ages
15, 25, and 55) and by the construction of interest-maturity scales for
each of the four age levels selected for study. These analyses revealed
that age differences are less significant than occupational differences.
The interests of 15-year-olds agree in large measure with those of
25-year-olds (r = .57), are more like those of 35-year-olds (r = .fib),
and even more like those of 55-year-olds (r = .Hcj) (775:279); as about
one-third of the change that takes place between ages 15 and 25 occurs
during the first year (15.5 to 16.5), one-third during the next two
years, and one-third during the next seven years (775:259), it is clear
that interests are fairly well crystallized by age 18. Boys’ interests
tend to become less like those of physicians, dentists, and engineers as
they approach age 25, and more like those of office workers, salesmen,
accountants, physical directors, social science teachers, and personnel
managers; those whose interest-maturity scores on Strong’s Blank are
high are least likely to show changes of interest patterns, whereas the
interests of those whose interest-maturity scores are low are most
likely to undergo change.

The slight changes that take place after age 25 tend to be an undoing
of those that took place prior to that time, as shown by the higher
correlations between the interests of 15 and 35 or 55-year-olds than
those of 15 and 25-year-olds already cited. Strong has confirmed this
with two different sets of data (775:285-285). A study by Sollenberger
(726) provides a basis for the conclusion that increases in hormone
activity in adolescence account for the changes that take place in boys
at that stage; perhaps it is decreases in hormone activity after the
mid-twenties [suggested by studies of sex habits conducted by Kinsey
(124)] which account for the reversal. This tendency toward an undoing
of the 15 to 25-year-old changes should not, however, be interpreted as
a reversal of all trends, for the decreased interest in physical
activity and daring continues beyond age 25 and is the most striking
change during that period of little change; others are a decreased
interest in occupations involving writing, and a lessened liking for
change or interference with established habits. Strong summarizes his
work as follows: “The primary conclusion regarding interests of men
between 25 and 55 years of age is that they change very little. When
these slight differences over thirty years are contrasted with the
differences to be found among occupational groups, or between men and
women, or between unskilled and professional men, it must be realized
that age, and the experience that goes with age, change an adult man’s
interests very little. At 25 years of age he is largely what he is going
to be and even at 20 years of age he has acquired pretty much the
interests he will have throughout life” (775:313).

The second series of studies conducted by Strong were follow-ups of
175 Stanford freshmen, retested nine years later, and of 168 Stanford
seniors, retested ten years after graduation. The average correlation
between test and retest scores was .56 for those first tested as
freshmen (ages 18 and 27) and .71 for those first tested as seniors
(ages 21 and 31). These findings from longitudinal studies confirm the
conclusions drawn from Strong’s cross-sectional analyses in revealing a
fair degree of permanence of interests in 18-year-olds, and a
substantial degree in 21-year-olds. The lowest retest reliabilities at
these ages were in the social welfare occupations, and the highest in
the scientific and literary occupations; that this is partly due to
decided increases in social welfare scores and relative stability in
literary scores is shown by critical ratios ranging from s.r y to 5.1
for the test-retest means of the former, and by critical ratios of -0.6
and -1.5 for those of the latter (775;^(>3). As those for the
test-retest means of the scientific occupations ranged from 0.6 to
.j.l\, showing a tendency for some increase in scores to take place
there, it must be deduced that the changes in scientific scores are
regular and do not generally affect the rank order of the persons
tested, while the changes taking place in social welfare interests are
irregular and do generally affect the rank order of the persons tested.
In other words, those who make the highest scientific scores tend to
remain highest, and those who make the lowest scientific scores as
seniors tend to remain lowest, while some of those who made the lowest
social-welfare scores make substantial gains in this area and others do
not. Strong’s other findings show that it is those persons with the
lowest interest maturity scores who make the most radical gains.
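
For readers unfamiliar with the critical ratio used in these
comparisons, the following sketch shows one common way of computing it
for test-retest means: the mean change divided by the standard error of
that change. The paired-difference form and the five pairs of scores are
assumptions made for illustration; they are not Strong’s data, and his
exact computation is not given in this section.

```python
from math import sqrt
from statistics import mean, stdev


def critical_ratio(test_scores, retest_scores):
    """Mean test-retest change divided by its standard error."""
    differences = [after - before
                   for before, after in zip(test_scores, retest_scores)]
    standard_error = stdev(differences) / sqrt(len(differences))
    return mean(differences) / standard_error


# Hypothetical standard scores for five men tested twice.
TEST = [40, 45, 50, 55, 60]
RETEST = [46, 50, 56, 60, 66]

cr = critical_ratio(TEST, RETEST)
print(round(cr, 2))
```

Every man in this invented group gains five or six points, so the change
is large relative to its standard error and the ratio is far above the
values of about 3 that were conventionally treated as significant; the
rank order of the men is nevertheless unchanged, which is the pattern
described above for the scientific scales.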

The Adolescent Growth Study investigations were, as the title implies,
longitudinal studies of high school pupils who were tested with Strong’s
Blank in the 10th or 11th grades and retested each year until graduation
from high school at about age 18 and, in some cases, until after
graduation from college. These studies showed that the correlations
between interest patterns in 10th or 11th grade and the last year of
college are about as high as those between interests in the first year
of college and five years after graduation, for Taylor’s study (Nig)
revealed a mean correlation of .52 for 11th graders retested six years
later, as compared with Strong’s average correlation of .56 for college
freshmen retested nine years later. Carter (1 n) and Taylor and Carter
(Ni j) have similarly demonstrated that the interest patterns of high
school boys and girls (in practically the only studies of change in
girls) remain fairly stable throughout the high school years. Carter
concluded that “the Strong scales are almost, but not quite, as reliable
and stable when used at the high school level as when used with adults,”
and Taylor stated that “vocational interests, as measured by the Strong
inventories, appear to be almost as permanent during the high school
years as during adult life.” It may be well at this point, however, to
remember Strong’s more cautious conclusions, already quoted on page geji.

Communality of Interests. The significance of differences in the
interests of occupational, socio-economic, avocational, sex, and age
groups obscures an important fact brought out by Strong’s research
(775: Ch. (>), namely, the fact that people’s interests are far more
similar than different, regardless of sex, age, or occupational status.
It is not really surprising to learn that people are human, and yet the
fact is easily lost sight of when they are studied as men and women,
boys and men, or professional men and skilled workers. The likes of
college men and women are very similar (r = .71), those of 15-year-old
boys and 55-year-old men are no less similar (r = .75), and those of
unskilled workers and professional-managerial men resemble each other
even more closely (r = .8j). Underneath the very real differences among
various groups of people we find an even larger common core which is of
great social and philosophical importance.

Stability of Interests. The question of the permanence of interests
is closely tied up with that of change of interests associated with age.
We have seen that age changes do take place in adolescence, but that
the patterns of interests which begin to manifest themselves by age 15
tend to be those which are revealed at ages 25, 35, and 55. Most of the
change which does take place with maturity is complete by age 18; the
type of change which may take place at that age is systematic and
predictable on the basis of interest inventory data (interest-maturity
scores). It is still pertinent, however, to inquire concerning the
permanence of interests when they are subjected to influences which may
change them in one direction or another. Kitson (J30), for example, has
described a series of projects designed by O’Rourke to modify interest
in vocational activities. The evaluation in terms of changes in
expressed interests showed that pleasant experiences do change overt
attitudes toward activities. But whether or not underlying interests, or
interest patterns as measured by Strong’s Blank, are thereby modified
remains to be ascertained.

There has been surprisingly little study of this problem, insofar as
inventoried interests are concerned; the focus has generally been on the
effect of rather limited experiences on expressed preferences. However,
Burnham (1*25), Glass (775:379), Mather (775:379), Hwan and Johnson
(liiio), Klugman (135), Strong (775:388-^11), and Van Dusen (889) have
investigated the effects of school and vocational experiences on
inventoried interests.

The relationship of change of inventoried interests to college grades
among Yale students was studied by Burnham, who found no relationship;
such changes in interests as did take place could not demonstrably be
attributed to the kind of grade achieved in college courses. Klugman’s
contrary findings concerning the clerical interests of high school girls
probably prove little, in view of the general tendency for girls and
women to make high clerical scores. Van Dusen worked with engineering
students at the University of Florida, a group whose mean scores, as
Strong points out (775:278), were very low, suggesting that they may
have been, not a selected group, but rather a heterogeneous collection
of state university freshmen who thought they would like to study
engineering. He found a slight and statistically insignificant decrease
in the retest scores of students who had given up their freshman choice
by their senior year, and similar increases in the engineering scores of
students who remained in that field throughout college. Strong failed to
find the last trend in Stanford engineering students, but confirmed the
others in studying the occupational histories and test scores of his
Stanford seniors who were followed up ten years later: those who were
finally employed in a field other than that preferred when they were
seniors made retest scores which were 5.6 standard score points lower
than their original scores in the latter field; the critical ratio
approached significance (2.5). The retest scores on the finally-entered
occupation were higher by a comparable amount than were the original
scores for that occupation. It is significant that there were no changes
in the scores of those who entered and remained in the field of their
preference as seniors: ten years of occupational experience did not
increase interest in the field of employment. Strong also analyzed the
employment histories and test scores of Stanford freshmen retested nine
years later, and found essentially the same results.

Mather, as reported in Strong, found no increase in home economics
teacher scores after practice teaching in that field (a limited sample
of experience, and a group already somewhat selected by training); she
did, however, find substantial increases (j.7 standard score points, or
one-half sigma) in the appropriate interests of .J5 students who were
retested after their first two years of exposure to the field of home
economics. These studies suggest, either that experience in a field
inappropriate to one’s interests causes one to become even less
interested in that field (and, conversely, more interested in
appropriate occupations), or that it helps to bring about a better
understanding of one’s likes and dislikes and obtain more nearly true
scores on an interest inventory. In the case of appropriate experience,
however, there seems to be no effect, perhaps because understanding is
already good enough not to be affected. As the inventory is a
self-portrait technique, the second explanation seems acceptable.

Not in keeping with this interpretation are Glass’ results, which
showed that the interests of unselected engineering freshmen who
remained in engineering college until graduation became less like those
of engineers (shift from B+ average to B), while the interests of those
who dropped out as freshmen were interpreted as having become
somewhat more like those of engineers. In the latter instance, however, it was
an insignificant raw score increase of two points, but one which happened
to change the mean letter grade from B to B+. The decline in the
interest of the graduates may have been due to poor guidance and selection,
such as frequently results in many able but uninterested students
persisting until graduation: thus many graduate engineers never enter
engineering occupations, but become salesmen, accountants, etc.

Two studies have considered the relationship between length of time
in an occupation and similarity of interests to those of men successful
in that field. Both of these (d>bo, 775:487) found insignificant correlations
(.00 to -.12) between the interest scores and length of experience of sales
and service men in the one case, and of life insurance salesmen in the
other.

It is perhaps not so difficult to synthesize these findings into a theory
of the effect of adolescent and adult experiences on vocational interests
as their occasional apparent discrepancies suggest. Strong (775:380)
concludes that “the interests of occupational groups are present to a large
degree prior to entrance into the occupation and so are presumably a
factor in the selection of the occupation,” rather, the implication being,
than the result of experience in that occupation. This conclusion is
legitimate and adequate enough as a generalization concerning the
permanence of interests, but it does not go as far as the data warrant in
describing the modification, as opposed to the creation or destruction, 
of interests by experience. Before citing the attempts of others to provide 
such a synthesis and interpretation, however, three more aspects of the 
problem of the origin and development of vocational interests need to 
be dealt with. These are family resemblances, and the roles of aptitudes 
and of personality factors. 

Family Resemblances. The inventoried vocational interests of 110
pairs of fathers and sons were correlated by Strong (775:680), the sons
ranging in age from 15 to 28 with a mean of 22 years. The range of
correlations for 22 vocational interest scales was from . n to ..]8, the
average intercorrelation being .29; the average intercorrelation for
randomly assorted men and boys, from the same total group, was .03. The
interests of 125 pairs of fathers and sons were studied by Forster in a
thesis cited by Berdie (79:1.15); in this study the sons were all students at
the University of Minnesota. The range of intercorrelations for 25
occupational interest scales was from .00 to .48, with an average of .33. Berdie
(77) found that the sons of men in the skilled trades and in business
tended to have inventoried interests in those fields, although the
relationship did not hold for other fields. The reason may lie in the fact
that, as shown in a study of the writer’s (790), these two occupational
fields are near the top of the blue-denim and white-collar occupational
ladders, making it socially acceptable for sons of business men and skilled
workers to aspire to emulate and identify with their fathers, but less
easy for the sons of unskilled, semiskilled, or clerical workers, who are
not at the top of either ladder, to do so. It would be difficult to explain
the lack of relationship among the interests of professional men and their
sons in Berdie’s study in terms of this hypothesis, since they are already
high on the white-collar ladder, were it not for positive results which he
reports from a study by Dvorak. She found that the interests of physicians
and their sons were similar. This suggests that sampling errors may have
affected Berdie’s results for this one occupational level, in which case
the hypothesis that family resemblances are most likely to be found at
the levels which are considered near the top of a social ladder would be
confirmed.

Other family relationships studied are those of twins, both identical
and fraternal, in a report by Carter (1 j2). His subjects were 120 pairs of
twins, p; of the pairs being monozygotic. For these latter the average
correlation was .50, whereas that for dizygotic twins was .28. Carter,
Strong, and others have argued that the closer resemblance of the
interests of identical than of fraternal twins does not prove that heredity plays
a part, for “the environments of identical twins are more similar than
those of fraternal twins” (1 15 :1 ). This is an oft-repeated statement, but
one which has not, to this writer’s knowledge, ever been demonstrated.
It is even more logical to maintain that the environments of fraternal
twins are more similar than those of fathers and sons, in view of the
differences in age, generation, and daily routines in the latter case; but
we have seen that the interests of fathers and sons resemble each other
just as closely as do those of fraternal twins (.29 or .33 and .28
respectively). It seems necessary, then, tentatively to conclude that the greater
similarity of the interests of identical twins, as contrasted with those of
fraternal twins, is not due to the potentially greater similarity of their
environments, but rather to the demonstrably greater similarity of their
heredities.

This viewpoint is that espoused by Strong (775:682), who points out
that if environment is so predominantly important it is odd that boys
and girls learn different interests by the time they are 15 years of age and
unlearn them so little thereafter (see the discussion of sex differences),
and that occupation-like differences which are found in the interests of
adolescents are altered so little by subsequent training and experience.

Aptitude as a Source of Interest. The necessary conclusion, as Strong
sees it, is that “interests reflect inborn abilities” (775:682). There is little
evidence, however, by means of which this inductive hypothesis can be
verified or rejected. It has been demonstrated that there is some
relationship between intelligence and inventoried interests: Strong (775:332-333)
has summarized the various studies, showing that the correlations range
from about -. jo to . jo depending upon the type of interest. The
positive correlations are with scientific and linguistic interests, while the
negative relationships are with social welfare, business contact, and
business detail interests. Readers who happen to be social workers or teachers
need take no offense at the first of these negative relationships, which
shows that no normal person is too dull to take an interest in his fellow
men, and that there is a tendency for mentally superior people to let
themselves become absorbed, perhaps to too great an extent, in other
matters. As scientific and linguistic occupations deal primarily with
abstractions, and social welfare and business occupations at least partly with
tangibles, what these relationships demonstrate is that, without the
ability to understand, there can be little genuine interest.

There have been fewer studies of the relationship between special
aptitudes and inventoried interests. Adkins and Kuder (8) correlated
scores on the Primary Mental Abilities Tests with those on the Kuder
Preference Record, and found that only one correlation was above .30:
that between number ability and computational interest in women.
Although this one relationship seems logical, it did not hold for men,
and other equally appropriate relationships were not found in a high
enough degree to justify any positive conclusions concerning the
relationship of aptitude to interest. However, Darley (191) found somewhat
clearer indications of relationships between PMA Test scores and six
representative Strong scales (r’s ranged from -.04 to .31), and Long (478)
found the expected relationships with the Stanford Scientific Aptitude
Test. Other comparable data were reported in a thesis by Leffel (460),
who found positive relationships (r = .46 and .42) between the O’Rourke
Mechanical Aptitude Test and engineering and chemical interests on
Strong’s Vocational Interest Blank, and negative relationships between
O’Rourke and Strong’s social science teacher and lawyer scales (-.25 and
-.25), and by Holcomb and Laslett (375), who obtained similar findings
with Stenquist’s mechanical (paper-and-pencil) test. As the O’Rourke is
to an indeterminate degree a measure of information, and therefore of
interest as well as of aptitude, it would be difficult to draw any pertinent
conclusions from Leffel’s findings were it not that Holcomb and Laslett
(375) and Moore (536) cite comparable data for the MacQuarrie and the
Bennett Mechanical Comprehension Test. It seems, then, that there is
some relationship between aptitudes and interests.

As so little research has been carried out to test Strong’s hypothesis
concerning the relationship of aptitudes to interests, it may be well to
reproduce his reasoning on this point. “An interest is an expression of
one’s reaction to his environment. The reaction of liking-disliking is a
resultant of satisfactory or unsatisfactory dealing with the object.
Different people react differently to the same object. The different reactions,
we suspect, arise because the individuals are different to start with. We
suspect that people who have the kind of brain that handles mathematics
easily will like such activities and vice versa. In other words, interests
are related to abilities and abilities, it is easy to see, can be inherited.
There is, however, a pathetic lack of data to substantiate all this”
(775:682-683). Strong believes that there are two reasons for this: interests
must reflect the environment, and they are evaluated by the environment.
Whereas a primitive Indian boy with fine finger dexterity might make 
arrowheads, the concomitant satisfaction of finger dexterity might make 
an urban American boy aspire to the occupation of dentist or watch 
repairman. Interpreting this aspiration in terms of socio-economic levels, 
the professional man’s son might want to be a dentist, the son of a 
skilled tradesman might want to be a watchmaker. Establishing a causal 
relationship between aptitude and interest is difficult under these
circumstances.

A conclusion diametrically opposed to Strong’s was reached by Berdie
after reviewing a number of studies of the relationship between ability
and vocational interests, most of which actually dealt with choices or
expressed preferences rather than with inventoried interests (79:142).
He wrote: “The available evidence indicates, however, that a person’s
ability is not a very important factor in determining his interests, and
although a relationship can be found between the two factors, this
relationship is so small that we must look further if we are to understand the
sources of vocational interests.”

With so little evidence on which to base conclusions, it seems likely
that this disagreement is one of orientation rather than of interpretation:
each writer sees essentially the same facts, but, as the situation is not
clearly structured, they are differently interpreted. Berdie’s orientation
seems to be similar to that of Darley (189: Ch. 6), who rejected Strong’s
deductions because “in general the magnitude of such correlations is
too low to substantiate the hypothesis,” and because differing amounts
and kinds of aptitudes “might be required” for success in architecture
and in chemistry, both of which belong to the same interest family.
Concerning Darley’s first objection, we have just seen that there is insufficient
evidence, and that it assigns a real role to intelligence. Concerning his
second objection, one can only point out that it is hypothetical and that
it would be quite logical for two partly overlapping complexes of
aptitudes to contain some differing factors, thus resulting in two related but
not identical constellations of interests such as architecture and
chemistry, which belong in the same occupational interest group and which
both require number and spatial aptitudes. The different channeling
in architecture and chemistry could be due to other aptitudes, social
approval, or personality factors. Darley’s objections to Strong’s
hypothesis therefore do not seem very compelling. Be this as it may, he thought
it necessary to change the point of departure. After also rejecting a form
of recapitulation theory, because of general lack of substantiation of
such theories, he writes: “The adjectives by which our behavior is
characterized in the description of others have usually been applied as
personality values or attributes in late adolescence or young adulthood. Our
occupational stereotypes of the ‘typical salesman’ or the ‘meek
bookkeeper’ or the ‘absent-minded professor’ evoke a series of such adjectives
when we attempt to define the stereotype. It is possible then that
occupational selection and elimination is based on personality type as well
as amounts and kinds of ability and aptitude. The third hypothesis of
the origin of occupational interest types is that they are by-products of
the development of personality types” (189:56). Darley goes on to cite
evidence which he believes substantiates his hypothesis, quoting Carter
(i.pj) to the same effect. Such evidence is reviewed in the paragraphs
which follow.

Personality and Interests. Social attitudes are the least fixed of
personality traits in the sense of being most clearly and readily affected by
the environment. In a preliminary study of the relationship between
these and vocational interests during the Depression, Darley (188) found
that students with interests like those of personnel managers and YMCA
secretaries had the highest morale, and those with interests like those
of engineers and chemists were lowest on morale, as these are respectively
measured by Strong’s Blank and the Minnesota Scale for the Survey of
Opinions. Such results in a preliminary study led to an analysis of data
from 1000 cases tested at the University of Minnesota (i8c):(b;--f>r,). This
revealed that, contrary to the findings of the preliminary study, there
was no relationship between morale scores and type of interests. On the
other hand, differences in liberalism and social adjustment were found:
those with welfare interests were most liberal, those with business
interests least so; those with social welfare and business contact interests
were best adjusted socially, those with linguistic and technical interests
least so.

If values are thought of as representing a layer of personality which is
deeper than those at which vocational interests and social attitudes are
found, then it is significant that there is also some relationship between
these two types of interests. Sarbin and Berdie (liliS) obtained Strong
and Allport-Vernon scores from 52 college students, and found positive
relationships between scientific interests and theoretical values, welfare
interests and religious values. Duffy and Crissy (217) obtained similar
data from 108 college women, reporting intercorrelations which were in
the expected directions and generally in the .30’s. Burgemeister (12.J)
confirmed these findings with another group of i(ij college women,
reporting that the interests of librarians, artists, and authors, for example,
tend to be associated with aesthetic values, and that those of physicians
and science teachers tend to go with theoretical values. Ferguson,
Humphreys, and F. W. Strong (253) have also confirmed these trends, with
93 college men.

Personality traits at a somewhat deeper level were also included in
Darley’s investigation (1 Bydig). These were measured by means of the
Bell Adjustment Inventory and the Minnesota Scale for the Survey of
Opinions, the former yielding scores for home and emotional adjustment
and the latter for feelings of inferiority and family adjustment which
are of interest to us here. Darley reports that home and emotional
adjustment were not related to any occupational interest patterns;
inferiority feelings were somewhat less common in those with welfare interests
than those with a technical or no primary interest patterns, and family
attitudes were somewhat better in men with business detail interests than
in those with linguistic, or no primary interest patterns, but neither
inferiority feeling nor family attitudes differentiated between other
interest groups. Berdie (77) used the Minnesota Personality Scale and
Strong’s Inventory, and found that high school seniors with interests
like those of engineers had inferior social adjustment, whereas those
with social welfare interests were better adjusted socially and emotionally.
In the only other study of this type known to the writer, Alteneder (i.j)
found no correlations which exceeded .25 between men’s adjustment
and interest scores for six occupations, and only four which exceeded
that for seven women’s occupations. These latter were .yj and .38
between social adjustment (Bell) on the one hand and linguistic and social
work interests (Strong) on the other, and a; | and .2f> between emotional
and home adjustment on the one hand and teaching interest on the
other. Although the results of Alteneder’s women’s study are intriguing,
the lack of positive results for the men’s occupations makes them merely
suggestive.

A still deeper level of personality organization was studied by Triggs
(87*;), who correlated cyclical, paranoid, schizoid, and other temperament
traits as measured by the Minnesota Multiphasic Personality Inventory
with vocational interests measured by the Kuder Preference Record.
Significant relationships reported in that paper were, for 35 college men,
those between depression and social service (r = -.34) and clerical (.38)
interests; psychopathic deviation and mechanical interests (- ..ji);
femininity and mechanical interests (-.37); paranoia and computational
(-. 12) and scientific ( -ajS) interests; psychasthenia and scientific (- epp),
musical (.gg), and clerical (ep;) interests; and schizoid trends and musical
(.gq) and clerical (.*;2) interests. It is perhaps worth noting that these
relationships are suggestive of more positive personality adjustments
being found with mechanical, computational, scientific, and social service
interests, and of more maladjustments being associated with musical
(psychasthenic and schizoid) and clerical (depressed, psychasthenic, and
schizoid) interests. These relationships are all significant at the 5, and
occasionally almost at the 1, percent levels. When the same techniques
were applied to women college students, fio in number, no relationships
were found except between lie score and musical and social service
interests (the mean lie score for the whole group was normal). The
apparent discrepancy between the sets of data for men and for women may be
the result of the small size of the samples, which certainly require
confirmation with larger numbers, but it is also possible that certain
vocational interests could have pathological significance in men and yet be
quite wholesome in women. Triggs, at least, felt that the relationships
which she found were significant.
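
Whether a correlation of a given size is significant depends heavily on the
number of cases. A minimal sketch in Python, assuming the usual t-test for a
Pearson correlation, shows how the legible coefficients for the 35 men compare
with the conventional 5 and 1 percent levels. The two-tailed critical t values
for 33 degrees of freedom (2.0345 and 2.7333) are taken from a standard t
table and are supplied here as assumptions; they are not figures from the
study, and the function names are hypothetical.

```python
import math


def t_from_r(r, n):
    """t statistic for a Pearson correlation r based on n paired cases."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def critical_r(t_crit, n):
    """Smallest absolute r that reaches a given critical t with n cases."""
    df = n - 2
    return t_crit / math.sqrt(t_crit * t_crit + df)


N_MEN = 35          # sample size reported in the text
T_CRIT_05 = 2.0345  # assumed table value, two-tailed, df = 33, 5 percent
T_CRIT_01 = 2.7333  # assumed table value, two-tailed, df = 33, 1 percent

r_needed_05 = critical_r(T_CRIT_05, N_MEN)
r_needed_01 = critical_r(T_CRIT_01, N_MEN)

print(f"r needed at 5 percent: {r_needed_05:.3f}")
print(f"r needed at 1 percent: {r_needed_01:.3f}")

# Coefficients that are legible in the text (signs dropped).
for r in (0.34, 0.37, 0.38):
    t = t_from_r(r, N_MEN)
    print(f"r = {r:.2f}: t = {t:.2f}, "
          f"5 percent: {t > T_CRIT_05}, 1 percent: {t > T_CRIT_01}")
```

On these assumptions a coefficient must reach roughly .33 in absolute size to
be significant at the 5 percent level with 35 cases, and roughly .43 at the
1 percent level, which illustrates why samples of this size call for
confirmation with larger numbers.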

As Darley has pointed out, Terman and Miles’, and Strong’s data on
masculinity and femininity of interests indicate a relationship between
temperament and vocational interests, the endocrine basis of which has
been demonstrated by Sollenberger (726). Work with an information
test designed to measure temperament factors through information and
interest (316: Ch. 14, 25; 925:68-74) tends further to substantiate the
hypothesis that interests are related to temperamental factors.

Origin and Development of Interests. The first published attempt
to synthesize findings such as those reviewed above into a theory of the
development of vocational interests was made by Carter (ij.j), with a
focus which is primarily environmental. As he sees it, the individual
derives satisfaction from identification with some group, by which means
he attains status. If his abilities permit, this identification is strengthened;
if insurmountable obstacles are encountered, the process of identification
is interfered with, the self-concept is changed, identification with another
group must take place, and with it a new pattern of interests is developed
which is more compatible with the aptitudes of the person in question.
Carter goes on to state that the interest patterns of adolescents tend to
become increasingly practical, that in the beginning many adolescent
interest patterns provide very unsatisfactory solutions of the problem of
adjusting their aspirations to personal abilities and social demands. He
writes (14 p 186): “In this process of trying to adjust to a complex culture,
the individual finds experiences which offer some basis for the integration
of personality. The pattern of vocational interests which gradually forms
becomes closely identified with the self . . . The pattern of interests is
in the nature of a set of values which can find expression in one family of
occupations but not in others.”

This is essentially the line of thought developed independently by 
Darley, who quotes Carter in a briefer discussion of the same subject 
(189:57), and subscribed to by Berdie (79). This writer sees three serious
defects in it.

First, it is based partly on an environmentalistic interpretation of
Strong’s, Carter’s, and Berdie’s data on family resemblances, an
interpretation which, as we have seen above, does not seem warranted when viewed
objectively, however laudable it may be to believe in the essential 
modifiability and improvability of man. 

Secondly, although it takes into account the role of aptitudes, person¬ 
ality is postulated as the basic factor, modified by the interaction of apti¬ 
tudes and environment. But this, we have seen, is on the basis of evidence 
which is fragmentary, tentative, and not much more convincing than 
that on the role of aptitudes which caused Strong to postulate that apti¬ 
tudes are the fundamental factor. 

Thirdly, although Carter’s description of the process of identification,
trial, disruption, and reshaping of the identification sounds convincing,
there is no evidence for it in the intensive analyses which he and other
members of the California Adolescent Growth Study have published, nor
in the publications of Darley, Berdie, and other Minnesota psychologists.
On the contrary, we have seen that everything that has been published
on the development or stability of interests from the beginnings of
adolescence on suggests that the form in which interest patterns begin to
crystallize is essentially the form in which they remain, except as they are
modified by glandular changes associated with age.

An explanation of these phenomena which attempts to take into
account the stability of inventoried interests has been advanced by Bordin
(111). As he puts it, “One of the major facts which Strong has established
concerning his blank is the continuity of interest patterns. In general he
has found that these patterns become more stable as the group studied is
older. Reading in between the lines of most discussions of the interest
test phenomena, this fact is taken to mean that Strong interest patterns
are fixed, once developed, and therefore any actual changes are due to
unreliability or other types of error. But our theory can encompass the
same phenomena without recourse to the catch-all concept of error. First
of all, we assume that it would be acknowledged as a social-psychological
and sociological fact that the older the individual is, the more likely it is
that he will have established himself occupationally and the less likely it
is that conditions will require a change in his occupation . . . In
answering a Strong Vocational Interest test an individual is expressing his
acceptance of a particular view or concept of himself in terms of
occupational stereotypes.” (111:50 and 53).

Bordin therefore agrees with Carter in thinking of inventoried interests
as the reflection of a self-concept, which is developed as a result of the
interplay between the endowments of the individual and the environment
in which he lives. He differentiates between Carter’s view, which he
considers dynamic because of this emphasis on interaction, and Darley’s
viewpoint, which he considers static because of its emphasis on the
maturation of personality traits and their biological basis. He characterizes
Strong’s viewpoint, which, like Darley’s, he judged from personal
conversations supplementing their writings, as empirical and going no further
than stating that there are interest patterns which differentiate men in
one occupation from those in others. Reconciling the data on the stability
of interests with Carter’s theory of their dynamic nature by the use of
Carter’s interpretation of interests as self-concepts, he by-passes the first
two objections just raised by this writer to Carter’s, Darley’s, and Berdie’s
viewpoints. He states: “If under personality we include the specific long-
and short-term goal-directed strivings of the individual, then this view of
interest patterns may be described as considering these patterns as
by-products of the individual’s personality. We must recognize that these
strivings are in a state of flux, changing to meet the fluctuations in the
situation.” (111:51). Bordin goes on to set up a series of challenging
hypotheses and corollaries which he believes research will prove valid.
As most of them are still to be tested, they remain in the realm of
hypotheses; in the writer’s opinion, they seem sound. He would, however,
incorporate them in a concept of interest which puts less exclusive emphasis on
personality and on environment, for the facts which have been reviewed
justify assigning important roles also to ability and to heredity.

As this introductory section on the nature and development of interests
has taken on the proportion of a small monograph, thanks to the newness
of the important work on interests, it may be well briefly to summarize
the results of the research which have been reviewed in order to bring
them into sharper focus before setting forth a theory of interest which
they seem to justify.

In summary, the inventoried interests of fathers and sons resemble each
other about as much as do those of fraternal twins, whereas those of
identical twins are considerably more alike, suggesting, since fraternal twin
environments are more similar to each other than are father-son
environments, that heredity plays a part in the development of interests. Interest
patterns are related to degree of general intelligence, apparently because
without understanding there can be no genuine and enduring interest or
because a self-concept cannot endure unless it can be in part made a
reality. There is no satisfactory evidence as yet concerning special aptitudes
and interests. Attitudes such as liberalism and social adjustment are
related to interest patterns, even prior to occupational experience. This is
true also of values, which are presumably more deep-seated aspects of
personality. Personality adjustment in the sense of feelings of adequacy
and security has not been shown to be related to interest patterns. There
is some evidence that temperament and endocrine make-up may be
related to interest patterns, at least insofar as they affect masculinity and
femininity, but the experiments in question are limited in number and in
scope.

Experiences such as courses in school and college, and staying in an
occupation over a long period of time, have no effect on inventoried
interests, although the experiences of the first two years of college
training in a professional field have been shown, perhaps because of the
importance of the first real contact with a field, to have some effect on
inventoried interests. Those who leave a field of training while in college
tend to undergo a decline of interest in that field after leaving, and those
who change to a field tend to show some increase in related interests after
they have made the change, but these changes are not on the whole very
great. They are significant enough so that it is possible that some persons
do show real changes of interests.

A theory of interests which would take into account all of the above
facts, without going beyond them, must recognize the significance of
heredity, as shown in family resemblances and as implied in the data on
aptitudes, personality, and endocrine factors; it must also recognize the
role of experience, as shown in the data on modification of inventoried
interests with change of type of experience. An adequate theory of
interests must build on the findings concerning the relationship between
general aptitude and interest, which imply that in some instances aptitude
probably does come first, resulting in approval, satisfaction, and interest.
It seems probable that aptitude plays a part in the development of
personality traits, as shown in certain studies of the effects of social skills on
adjustment (jtjS), and therefore in the development of interests
as these are affected by personality. And it must recognize the fact that
there are relationships between interests and the deeper layers of
personality such as values and temperament, and possibly also personality traits
and drives (although these last two relationships have not been and may
perhaps not be established), and that these relationships are in some
instances causal. In other words, an objective theory would recognize the
fact of multiple causation, the principle of interaction, and the joint
contributions of nature and nurture. It would read more or less as
follows:

Interests are the product of interaction between inherited aptitudes 
and endocrine factors, on the one hand , and opportunity and social 
evaluation on the other. Some of the things a person does well bring him 
the satisfaction of mastery or the approval of his companions, and result 
in interests. Some of the things his associates do appeal to him and,
through identification, he patterns his actions and his interests after
them; if he fits the pattern reasonably well he remains in it, but if not,
he must seek another identification and develop another self-concept and 
interest pattern. His mode of adjustment may cause him to seek certain 
satisfactions, but the means of achieving these satisfactions vary so much 
from one person, with one set of aptitudes and in one set of circum¬ 
stances, to another person with other abilities and in another situation, 
that the prediction of interest patterns from modes of adjustment is 
hardly possible. Because of the stability of the hereditary endowment and 
the relative stability of the social environment in which any given person
is reared, interest patterns are generally rather stable; their stability is
further increased by the multiplicity of opportunities for try-outs,
identification, and social approval in the years before adolescence. By
adolescence
most young people have had opportunities to explore social, linguistic, 
mathematical, technical and business activities to some extent; they have 
sought to identify with parents, other adults, and schoolmates, and have 
rejected some and accepted others of these identifications; self-concepts 
have begun to take definite form. For these reasons interest patterns begin 
to crystallize by early adolescence, and the exploratory experiences of
the adolescent years in most cases merely clarify and elaborate upon
what has already begun to take shape. Some persons experience significant
changes during adolescence and early adulthood, but these are most
often related to endocrine changes, and less often to changes in
self-concept resulting from having attempted to live up to a
misidentification
and to fit into an inappropriate pattern. Vocational interest patterns
generally have a substantial degree of permanence at this stage; for most 
persons, adolescent exploration is an awakening to something that is 
already there. 




CHAPTER XVII 

MEASURES OF INTERESTS 


THE discussion of definitions at the beginning of the preceding chapter 
pointed out that the most productive work so far in the measurement of
interests has been done with the inventory technique. For this reason only 
one test of interests is considered in this chapter, although it is hoped 
that others will be sufficiently developed during the next few years to 
justify later inclusion. This test is the Michigan Vocabulary Profile Test. 

As interest inventories have been developed in greater numbers there
is, in that respect at least, a broader field from which to choose. But it
has apparently been much easier to write “like-indifferent-dislike” items
than to ascertain what they measure and what their significance is for
counseling and selection. Some interest inventory authors have launched
their instruments without validation data and have not followed them
up sufficiently to make them useful. Others, such as Garretson (279) and
Dunlap (220,71s) at the junior high school level, and Cleeton (ifn) at
the adolescent and adult, have made careful and intensive studies prior
to or immediately after publication, but have not followed through with
further investigations of the nature of the traits measured by, or the
validity of, their instruments. Their inventories cannot therefore be
considered as more than potentially useful tools. One or two others, such as
that by Lee and Thorpe (fy$j), may in time be found useful, but data have
yet to be made available to demonstrate their value. Any user of such an
untried inventory in counseling or selection operates on faith alone, and
faith is a poor substitute for facts in psychology and in occupations. Two 
interest inventories and one values inventory have been studied over a 
period of years, and sufficient data have been accumulated to make them 
extremely valuable diagnostic instruments. As is brought out in the
discussion of the nature and development of interests, these are the Strong
Vocational Interest Blank, the Kuder Preference Record, and the
Allport-Vernon Study of Values. The first-named inventory has been the
subject of intensive study from many viewpoints over more than twenty years,
its author having assumed responsibility for integrating and interpreting
the results of relevant research (775,798); the second was experimented
with for several years before publication and has since been revised, and
new studies by its author and others are continually appearing in the
journals or in new editions of the manual (4 jb.802); the last-named
inventory has been used since 1933, during which time numerous
psychologists have reported on it, and several have assumed responsibility
for bringing these reports together and discussing the significance of the
traits measured (136,21b). These three inventories are therefore treated at
some length in this chapter. Much briefer treatments of the Cleeton
Vocational Interest Inventory and the Lee-Thorpe Occupational Interest
Inventory are also included, as these are either widely used or new and
well publicized instruments, some of which include some novel features. Also
treated are trends observable in recent interest test construction, as this
is a very active field and those who are experimenting with interest
measures may find a brief discussion of some value, even though they are not
presently useful to practitioners.

Several studies have compared existing interest inventories in order to
assess their relative value. Some of these have used occupational criteria,
effectively demonstrating the superiority of the Strong over the Hepner
and Brainard inventories (8.]). But others have compared one test with
another (e.g. 301) and with occupational preference, thereby proving
nothing unless one is willing to postulate the validity of one of the
indices, the validity of which is in question.

The Strong Vocational Interest Blank (Stanford University Press, 1927
and

The eminent group of applied psychologists assembled at the Carnegie
Institute of Technology after World War I directed their attention partly
to problems in the measurement of interests, particularly those which
might differentiate salesmen from engineers. The history of this work
has been recounted by Fryer (277: Ch. 3) and need not be repeated here,
beyond stating that Strong began his work with the inventory technique
as a member of this group, and took it with him to Stanford University,
where Cowdery (17.}) and other students worked with him in establishing
it as an effective method of differentiating between occupational groups.
Strong published his first edition of the blank in 1927, after several
preliminary studies had shown the validity of the approach; a new revision
that is currently in use was brought out in 1938, based on the work of
the intervening years; the many studies of the nature of the traits
measured and of their validity in educational and vocational counseling and
selection were brought together in his monograph of 1943 (775); new
occupational scoring keys are added from time to time as studies are
completed (Paterson is now revising the psychologist key, and Schwebel
is developing one for pharmacists); and the journals continue to carry
new studies of various aspects of the Blank’s significance and use. It is
without question one of the most thoroughly studied and understood
psychological instruments in existence.

Applicability. Strong’s Vocational Interest Blank was developed for
use with and standardized upon college students and adults employed in
the professions and in business. Because of this it includes some terms
which are unfamiliar to high school students and to adults in lower level
occupations. For example, even high school juniors and seniors filling
out the blank often ask the meaning of terms such as “sociology,”
“physiology,” and “smokers” and reveal a complete unawareness of the nature
or existence of the magazine System. For these reasons the question of
the use of Strong’s Blank with persons of less than college level has
frequently been raised. It can be answered from two sets of data: Strong’s
and Carter’s studies of the interests of adolescents and adults, already
discussed in some detail, and a recent investigation by Stefflre (752).

The age and stability studies, both cross-sectional and longitudinal,
have been seen to show that meaningful data can be obtained by means
of Strong’s Blank from boys and girls as young as 14 or 15, and that by
the time they are 18-to-20-year-olds their Strong scores are rather well
fixed. This suggests that, despite the apparent difficulty of some of the
words used in the inventory, it is sufficiently well understood at those
age levels to be applicable to most high school students.

The vocabularies of the Strong, Kuder, and other inventories were
analyzed by Stefflre, who reported that the Strong Blank has a 10th grade
vocabulary. This fits in with the data on its usefulness with 17-year-olds,
and suggests that it should be used below that level only with the more
able and more advanced students.

Potential users of interest inventories often ask whether a subjective 
technique such as this is subject to faking when used in selection pro¬ 
grams, and even in counseling, because of the desire to make high scores 
in some occupations. The job applicant wants to appear in the best possible
light, and even if he is above conscious distortion there are many
genuine opportunities to give oneself the benefit of the doubt in answering
an inventory. The student seeking guidance may be eager for self-insight
and for an objective picture of himself, but in answering the
questionnaire he is nonetheless guided by his self-concept, set of occupa¬ 
tional stereotypes, and a desire to appear favorably in the eyes of the 
counselor. Strong (775:684) and Steinmetz (751) have experimented with 
deliberate faking in students, first administering the inventory in the 
standard way, and at a subsequent session administering it with directions 
to attempt to raise the score on a specific occupation (engineering and 
school administrator, respectively). In both instances very great changes 
resulted, the mean scores shifting to such an extent that the majority 
received A ratings, in contrast with B+’s (engineers as engineers) and C’s
(business students as engineers, education students as administrators). 
Other scores were affected by these distortions, as would be expected in 
view of the intercorrelations. 

Faking by job applicants, a much more important experiment than
deliberate faking by students, was also checked by Strong (775:688-690),
who administered the Blank to 118 men responding to an advertisement
which he inserted in newspapers. The inventory was given as a preliminary
hurdle for life insurance sales positions; some, according to Strong,
took the questionnaire out of mere curiosity, but an indeterminate number
of others were more serious in their purpose. The scores made by
these men in their then occupations were compared with their scores as
life insurance salesmen, with the finding that only groups whose averages
were above a standard score of 40 on the sales key were already employed
in some kind of sales work. The conclusion was that, although some
individuals may have intentionally raised their scores somewhat, the
majority did not achieve, or perhaps even try, any appreciable distortion.
According to Strong (775:688), Bills did find that applicants under 24
years of age who scored A on both sales keys were less likely to succeed
as insurance salesmen than those who scored B+, or B+ and A, on the
two scales, presumably because of bluffing. In selection, therefore, the use
of other checks on interest inventories is probably desirable.

As for counselees, there is no experimental evidence that their scores 
are or are not affected by the desire to appear, to themselves or to the 
counselor, in a certain light. While Spencer (733) has shown that some 
personality inventory items are answered differently when a name is 
signed than when answered anonymously, he also showed that answers 
to other items, the least personal and the most like those in interest 
inventories, are not changed because the respondent can be identified.
Conscious distortion in counselees or students can therefore probably be
dismissed as negligible. No one has as yet found a way of checking up on
unconscious distortion, although it might be tried under hypnosis or 
narcosis. 

Content. The Vocational Interest Blank Form M (men) consists in
its present form of 400 items grouped according to type of content. The
first group is a list of many types of occupations at and above the skilled
level, emphasizing the business and professional fields. This is followed
by lists of school subjects, amusements (games, magazines, sports, etc.),
activities (hobbies, pastimes, etc.), peculiarities of people, vocational
activities, factors affecting vocational satisfaction, well-known persons
exemplifying occupational stereotypes, offices in clubs, and ratings of
abilities and personality characteristics (the actual grouping is not quite
as in this list, which is based on content rather than on form of item).
The women’s form has 263 items in common with the men’s and a total
of 400 in the revised form.

Administration and Scoring. There is no time limit for the Vocational
Interest Blank, as the task is to answer all questions; the time
required ranges from a little over 30 minutes for superior, well-adjusted
adults to something more than an hour for less able or less stable
individuals. It is well to allow an hour when testing groups, and to
administer the blank at the end of a test session, e.g., just before a rest
period, when it is part of a battery. This makes it possible to dismiss
subjects as they finish, but does not put too much pressure on those who
have not finished. In guidance centers the inventory is often given to older
adolescents and adults to complete on their own time at home; this works
well when the client has a place to work without having his responses
affected by the comments of onlookers, and when he understands the importance
for himself of filling it out rapidly and without consultation.

In answering the blank, the subject marks each item according to 
whether he likes, dislikes, or is indifferent to it. The answer to each item 
is assigned a weight based on the degree to which the answers of men 
in a given occupation, e.g. engineering, differ from those of
men-in-general. This procedure is sufficiently different from those normally
used in developing scoring procedures to be worth describing, for
understanding it means practically an understanding of Strong’s Blank. Table 28
presents Strong’s data for one item, “Actor,” showing the responses
of engineers and of men-in-general.

TABLE 28

DETERMINATION OF WEIGHTS IN STRONG’S BLANK: ITEM “ACTOR”

Group          % Like    % Indifferent    % Dislike
Engineers         9           31              60
Men (Gen’l)      21           32              47
Difference      -12           -1              13
Weight           -1            0               1

It is made clear by the “difference” row in Table 28 that engineers are
less likely to indicate a liking for the occupation “actor” than are
men-in-general, slightly less likely to indicate indifference, and much more
likely to show a disliking for it. By means of a formula based on the
significance of the difference between two percents these data are
converted into the weights shown in the bottom row. In scoring the
inventory of a young man who thinks he wants to be an engineer, but who
indicates that he would like being an actor, one would therefore deduct
one point from his engineering score: he has shown that, in this respect
at least, he is more like other men than like engineers. It is perhaps worth
noting that this is true, even though other men tend not to like being
an actor, for they indicate a liking for it more often than do engineers.
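
The arithmetic behind Table 28 can be sketched in a few lines of Python. The difference row is taken directly from the table; the conversion to weights is only a stand-in, since the text says no more than that the formula rests on the significance of the difference between two percents. The group sizes (500 each), the critical-ratio test, the step of 3.0, the limit of 4, and all function names are assumptions made for illustration, chosen so that the sketch reproduces the weights in the table; they are not Strong's published procedure.

```python
import math

# Percentages from Table 28 for the item "Actor": like, indifferent, dislike.
ENGINEERS = [9, 31, 60]
MEN_IN_GENERAL = [21, 32, 47]

# Assumed group sizes; the text does not give them for this item.
N_ENGINEERS = 500
N_MEN = 500


def critical_ratio(pct_a, pct_b, n_a, n_b):
    """Difference between two percents divided by its standard error."""
    p_a, p_b = pct_a / 100.0, pct_b / 100.0
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_a - p_b) / std_err


def weight_from_ratio(ratio, step=3.0, max_weight=4):
    """Hypothetical mapping: one weight point per `step` units of the ratio."""
    magnitude = min(max_weight, int(abs(ratio) // step))
    return magnitude if ratio > 0 else -magnitude


differences = [e - m for e, m in zip(ENGINEERS, MEN_IN_GENERAL)]
weights = [
    weight_from_ratio(critical_ratio(e, m, N_ENGINEERS, N_MEN))
    for e, m in zip(ENGINEERS, MEN_IN_GENERAL)
]

print("Difference:", differences)
print("Weight:    ", weights)
```

Under these assumptions the large differences for "like" and "dislike" each earn one point in opposite directions, while the one-point difference for "indifferent" is too small to earn any weight.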

The score for engineer, then, is the algebraic sum of the weights
corresponding to each answer marked by the client, a total of 400 weights.
A comparable addition and subtraction must be made for every occupational
or other score (e.g., masculinity-femininity) desired by the counselor.
To do this by hand is time-consuming, for it takes a novice about 15
minutes to score one blank for one occupation even with the stencils
provided for this purpose, and the men’s inventory is scored for more
than 40, the women’s for more than 20, occupations and traits. With the
aid of two Veeder counters (Nos. ZD-18-T and ZD-S-T, Veeder Mfg. Co.,
Hartford, Conn.) an experienced scorer can cut the time in half, averaging
about ten occupational scores per hour. As this would still mean
about four hours scoring time per men’s blank when all keys are used,
machine scoring is necessary when any number of subjects and occupational
scales are involved.
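
The scoring rule just described, an algebraic sum of weights repeated for every key, can be illustrated with a short Python sketch. Only the weights for "Actor" come from Table 28; the other two items, their weights, and the function names are hypothetical and stand in for the rest of a real key.

```python
# Fragment of an engineer key: item -> weight for each possible response
# (L = like, I = indifferent, D = dislike). Only "Actor" is from Table 28;
# the other two rows are invented for illustration.
ENGINEER_KEY = {
    "Actor": {"L": -1, "I": 0, "D": 1},
    "Item B (hypothetical)": {"L": 2, "I": 0, "D": -2},
    "Item C (hypothetical)": {"L": 1, "I": 0, "D": -1},
}


def score_blank(responses, key):
    """Return the algebraic sum of the weights for the marked responses."""
    return sum(key[item][mark] for item, mark in responses.items() if item in key)


def hours_per_blank(n_scales, scales_per_hour):
    """Hand-scoring time for one blank scored on n_scales keys."""
    return n_scales / scales_per_hour


# A client who would like being an actor loses one point on the engineer key.
responses = {
    "Actor": "L",
    "Item B (hypothetical)": "L",
    "Item C (hypothetical)": "I",
}
engineer_score = score_blank(responses, ENGINEER_KEY)

# Ten scores per hour over about 40 keys gives the four hours cited in the text.
experienced_hours = hours_per_blank(40, 10)

print(engineer_score, experienced_hours)
```

Because the same pass over the responses must be repeated with a different table of weights for each occupation, the cost grows with the number of keys, which is the practical case for machine scoring.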

Strong describes the methods in his manual. Briefly, they are the
Hollerith machine, reading the answers from the Blank, at a cost of about
$1.25 per blank for 4c) men’s occupations (the price varies); the IBM
method, in which a special answer sheet and electrographic pencil are
used (as with most standard tests for machine-scoring), at a similar cost
for all men’s scales; and the Hankes method (Engineers Northwest, 100
Metropolitan Life Bldg., Minneapolis 1, Minn.), requiring a Hankes
answer sheet, at 70 cents per blank for all j2 current men’s scales or all
2.| current women’s scales. The names and addresses of organizations
having scoring machines and offering scoring services to others are listed
in Strong’s manual, which is kept up to date; the Hankes method is
described in a paper by Strong and Hankes (782).

The cost of scoring Strong’s Blank has been something of a deterrent
to its use in some institutions, more often in public schools than
elsewhere, and the more recently published inventories with their less
expensive scoring have for this reason had a wide appeal. Many a user has
bought them frankly as less expensive substitutes for Strong’s Blank.
Because of the pressure to cut down costs, Strong and many others have
attempted to simplify the scoring as much as possible; these have been
summarized in Strong’s book (^5: Ch. 2 j) and followed up by another
study (778). Only Strong’s conclusions can be cited here: weighted scores
differentiate better than the unit scores proposed by Dunlap and others
and should therefore be used in counseling and selection. Weighting each
item one, instead of from -4 to 4, would, Strong has shown (778), lead
to different counseling in from one in every twelve to one in every six
cases. When the cost is approximately one dollar per case the price of
greater validity does not seem unduly high. Public schools and other
institutions spend far more per pupil on things of less significance than
finding out what kinds of educational and vocational activities are most
likely to challenge them. As a compromise, Strong has devised six group
scales, one each for the biological science, physical science, social welfare,
business detail, contact and linguistic groups. These correlate fairly well
with the specific keys and can be used when only directional counseling is
needed.

Scores on Strong’s Blank are recorded in many different ways by users
of the inventory. One frequently sees reports in which the occupational
scores are arranged in order of magnitude, all occupations in which A’s
are made being grouped first, the B+’s next, and so on. This is done on
the assumption that the counselor and client are most interested in the A
scores or, in their absence, in the B+’s. This method has two drawbacks:
it focuses attention on specific occupations, and it makes it difficult to 
perceive patterns of scores. Each of these is worthy of brief discussion. 

The Vocational Interest Blank can be scored for about 40 occupations, 
and the number may conceivably be increased to 45 or 50 in due course. 
But there are nearly 40,000 jobs in the Dictionary of Occupational Titles
(888), and, while many of these are more specific than those in Strong’s
Blank, and could be combined to make a smaller number, it would still
be true that interest in most occupations cannot be scored on Strong’s
Blank. It is manifestly unwise, then, to play up scores on specific occupa¬ 
tions. The result too often is that a student says, “I rate A as a minister, 
but I don’t have any desire to be a minister,” and the insights into in¬ 
terests which might be gained from that score are lost in the negative 
reaction to a stereotype of a specific field; or a client leaves the counselor 
and reports to his family that “One test showed that I should be a
personnel director. I wonder what the boss will think of that?” missing the
more general implications of that high score. 

When occupational interest scores are grouped according to their
factorial composition, however, the result is often quite different. This
puts related occupations together in families; it permits the analysis of
scores in terms of types of occupations rather than specific occupations,
and it makes it easy to see whether or not a high score in one occupation
is supported by high scores in related occupations. Thus an A as
physician, for example, is a much surer basis for encouragement in choosing a
premedical or biological sciences major if supported by A’s or B+’s as
psychologist, dentist, chemist, and engineer than if the scores of these 
occupations are largely B-’s and C’s. The report sheet published by
Strong, the Hankes Report Form, and many others are organized in such 
a way as to make possible this type of pattern analysis (see Fig. 7). 

Pattern analysis was first described in some detail in a booklet by
Darley (189: Ch. 2), in which a helpful distinction is made between
primary, secondary, and tertiary interest patterns. He defines as primary
interest patterns those fields in which the letter ratings received are
largely A’s and B+’s, as secondary patterns those occupational families
in which scores are predominantly B+ and B, and as tertiary those in
which they tend to be B’s and B-’s. Using this classification of the letter
ratings received by a counselee makes it possible to focus attention on
the kinds of occupations which he is likely to find congenial. It is more
helpful to know, for example, that his primary interest patterns are in
the scientific and literary occupations with a secondary pattern in the
social welfare field, than to know that he made A’s as psychologist,
physician, physicist, chemist, engineer, personnel director, public
administrator, advertising man, author-journalist and president of a
manufacturing concern. Darley tabulated the frequency of interest types or
patterns for 1000 men at the University of Minnesota; it is worth noting that
approximately half of these men had no primary interest pattern. This
is discussed at great length in connection with the use of Strong’s Blank.
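
Darley's three levels can be expressed as a small classification routine. The sketch below is an interpretation, not his procedure: the text says only "largely," "predominantly," and "tend to be," so the simple-majority rule, the order in which the levels are tested, and the function and variable names are assumptions introduced for illustration.

```python
# Letter ratings that characterize each pattern level, as described in the text.
PATTERN_LEVELS = [
    ("primary", {"A", "B+"}),
    ("secondary", {"B+", "B"}),
    ("tertiary", {"B", "B-"}),
]


def classify_family(ratings, majority=0.5):
    """Classify one occupational family from its letter ratings.

    Assumption: a level applies when more than `majority` of the ratings
    fall in that level's letters; levels are tried from primary downward.
    """
    if not ratings:
        return "none"
    for level, letters in PATTERN_LEVELS:
        share = sum(1 for r in ratings if r in letters) / len(ratings)
        if share > majority:
            return level
    return "none"


# Hypothetical profile: occupational family -> letter ratings on its keys.
profile = {
    "scientific": ["A", "A", "B+", "B", "A"],
    "social welfare": ["B+", "B", "B", "C"],
    "business detail": ["B", "B-", "B-", "C"],
    "sales": ["C", "C", "B-", "C"],
}

patterns = {family: classify_family(r) for family, r in profile.items()}
print(patterns)
```

Reporting the result by family rather than by single occupation keeps attention on the kind of work the counselee is likely to find congenial, which is the point of the distinction.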



Figure 7

INTEREST PROFILE OF A YOUNG MAN AT AGE 23, SOCIAL STUDIES
INSTRUCTOR, AND AT AGE 37, AS PSYCHOLOGIST (STRONG’S BLANK)
(Broken lines at 23, solid lines at 37). Occupations shown on the profile
include Artist, Architect, Psychologist, Physician, and Osteopath.


It should also be noted that, as might be anticipated, the use of interest
patterns has been found more valid, with entry into an occupation as
criterion, than specific occupational score (926).

Norms. The question of norms is in fact a double-barrelled question,
for it concerns both type and number of cases. As the details of both of
these are given in Strong’s manual and in his book (775), they
need not be reproduced here, but one frequently encounters misstatements
made by presumably well-informed users of Strong’s Blank. For
example, a specialist in the selection and training of nurses once stated
that the nursing scale of the women’s form was of little value because it
was based on about 100 nurses from one hospital in Chicago; it was
actually based on approximately 400 nurses, 283 located in, but not
necessarily natives of, the nurse-importing city of New York, the other
117 from upstate New York and elsewhere. This is not a balanced sample,
but neither is it as unbalanced as the above-quoted critic implied; as
data on the validity of the women’s form will later make clear, it is also
not the reason for the lack of correlation between scores and grades in
nursing schools. Some generalized statements concerning the norms of
the Vocational Interest Blank follow, in order to provide the orientation
to the base of this inventory which many users apparently lack.

The data concerning numbers are simple enough. Cowdery’s early
work (171) showed that occupational differentiation could be achieved
with groups of as few as 37 persons. For this reason, Strong’s first keys
were based on about 170 cases each, surely a conservative application of
Cowdery’s findings. Subsequent work (775), however, led him
to increase the number in order to increase the discriminating power of
the test; the numbers were therefore raised first to 270 cases per
occupation, and then to between 400 and 500 cases. Accordingly, the earliest
scales are based on groups of from 170 to 200 cases per occupation, the
newest scales on between 400 and 700 persons. Evidence reported by
Strong shows that these numbers are large enough to minimize shrinkage
of mean scores in cross-validation.

The question of type or quality is more complex, and can itself be
broken down into several questions. Outstanding among these are 1) the
criterion of success which warrants inclusion of a given case in the
criterion group, 2) the representativeness of the sample, and 3) the
timelessness of the sample or the degree to which the interests of successful
psychologists of 1928 are representative of the interests of successful
psychologists in 1948.

The criterion of success varies from one occupation to another, as one
might expect in view of the differences in occupations; output may
measure success in electrical unit assembly work, but not in teaching.
Strong’s life insurance salesmen sold at least $100,000 worth of insurance
annually for three years. In the cases of occupations which have some
accrediting or other evaluative procedure of their own it was used;
architects were members of the state board of architecture, carpenters
were union members, certified public accountants were certified in their
states, chemists were non-professorial members of the American Chemical
Society, and psychologists were Fellows (full members) of the American
Psychological Association. When no such criterion of being established
in a field was available, other evidence of status was used: the journalists
were editors listed in Editor and Publisher Yearbook, city school
superintendents were employed in cities of more than 10,000 inhabitants,
personnel managers were “carefully selected by competent authorities,”
and YMCA physical directors were selected by a YMCA college. All
members of criterion groups had been employed in their occupation for
at least three years, and none were over 60 years of age. In some cases
the criterion was probably not as stringent as in others: the apparently
miscellaneous collection of office workers were probably not as highly
selected, in their field, as the physicians who graduated from Yale and
Stanford were in theirs, especially when it is considered that the great
majority of the physicians practiced in favored areas. Sometimes the
criterion was established and the group selected by Strong (e.g.,
psychologists) and sometimes it was others who did these things (e.g., men
teachers). On the whole, however, there is little to quarrel with in the
criterion.

The representativeness of the sample is more difficult to judge, and has
not been investigated sufficiently to provide answers to the serious
questions which can be raised concerning some of the groups. The purchasing
agents were located in Northern California, Los Angeles, Washington,
D.C., and Cleveland; the psychologists were scattered throughout the
United States and constituted ;u> percent of the population from which
they were drawn; the personnel managers came from New England, the
Middle Atlantic States, the Great Lakes, and the Pacific Coast; and the
city school superintendents were located in various parts of the States:
these seem reasonably likely to prove representative. But the male
social-science teachers were all from Minnesota, and may have had interests
quite different from those of their confreres in Vermont and Alabama;
the real estate salesmen were all from California, and may differ
considerably from those of Massachusetts, though no doubt very much like
those of Florida (!); and the farmers were all from the West Coast, and
perhaps unlike those of Maine and Georgia. No check has been made of
the possible existence of regional differences in occupational interest
patterns, not even of the existence of regional differences in the interests
of men-in-general. Strong does report an unpublished study by Pallister
and Pierce (775) which compares the interests of Scotsmen with
those of Americans; the former were artists, journalists, ministers, and
policemen living in or near Dundee. The interests of Scottish artists and
policemen were very much like those of their American counterparts,
while those of ministers (possibly) and journalists (clearly) were different.
Strong concludes that the differences cannot be attributed to language
usage, since at most two occupational groups differ; he believes that they
are due to differences in sampling, and points out that the American
journalists were a highly selected group (listed in Who’s Who, etc.) while
the Scots constituted all the literary employees of one local publishing
house. There is also the possibility of national, and therefore regional,
differences in the selection of persons in some occupations.
Second-generation Japanese high school boys, born in America, were found to
resemble white Americans in their interests in another study reported by
Strong (775:677-679), leading him, as in the case of the Scottish study, to
conclude that richness of meaning has little effect upon responses to
interest items, and that the omission of some terms which are not
understood does not appreciably affect interest inventory scores. Although they
have only indirect bearing on the question of regional differences in
vocational interest patterns in the United States, these findings do
indicate that the latter may not be as important as a priori reasoning
might suggest. It is to be hoped that investigations of the differences
between social studies teachers in the Midwest and East, ministers in the 
Middle Atlantic and Southern States, and other regional occupational 
groups will in due course be made. 

The temporal validity of Strong’s occupational interest scales, like
their regional validity, has not been the subject of published
investigations. Professional self-consciousness and the rapid
development of their own profession have, however, made psychologists
conscious of the problem. It is frequently pointed out, for example,
that Fellows of the American Psychological Association in 1928 were
largely laboratory psychologists, more interested in problems of mental
organization and functioning as shown in introspective or experimental
studies of learning in humans and in animals than in problems of human
adjustment, and that, in contrast, the tremendous growth of industrial,
educational, and clinical psychology in recent years now puts the heirs
of these theoretical psychologists in the minority. The interests of the
two generations of psychologists may be quite different, or, on the
other hand, the common core of interest in the scientific study of man
may be so important that, when compared with other professional men,
they may seem quite similar. As mentioned in the first section of this
chapter, Paterson is conducting a study which will throw light on the
problem of interests in a changing profession. If he reports little
change, other keys may be used with some confidence, for psychology
appears to have been changing more than most occupations; if he does
find differences, caution will be needed in the use of scales for other
occupations which may have changed. Inspection of the list suggests only
one other which may have been affected in a comparable way, that of YMCA
secretary, in which profession the emphasis seems to have changed from
personal-religious to social.

Another aspect of the norming of the inventory which needs
consideration is the form in which its scores are expressed and the
normative group to which it compares a person. Strong provides
distributions of raw scores based on the appropriate criterion group
(e.g., engineers), standard scores and letter grades for the same
groups, and percentile scores for the criterion group, Stanford
freshmen, and Stanford seniors. This plethora of norms raises the
question of which to use. As was pointed out in an earlier chapter,
general norms such as those of the student groups have little value, for
they tell nothing of the individual’s prospects of success in
competition with selected occupational groups. It is therefore the norms
for the criterion group which should be used. As is pointed out in the
manual, the letter ratings have the advantage over percentiles and
standard scores in that they indicate clearly and readily whether or not
a person’s interests resemble those of men or women successful in the
occupation in question, without obscuring the issue with the problem of
little understood differences of degree. For although the difference
between the 60th and 90th percentiles on an aptitude test has known
significance, that between the same percentiles on an interest inventory
such as Strong’s does not; there is no reason for thinking that a high
degree of resemblance to the interests of the average successful worker
is superior to a moderate degree of resemblance. The man at the 60th
percentile might actually differ from the average successful worker in
ways which make him more like the most successful or satisfied workers
than the man whose 90th percentile rank indicates closer resemblance to
the average established man. In both counseling and selection,
therefore, it is better to use the letter ratings.

These are so established that the top 69 percent of workers in the
occupation are assigned scores of A, and the bottom 2 percent are
assigned scores of C. Thus anyone resembling the majority of established
workers is assigned an A or at worst a B+, and all persons who rate C on
an occupation are quite unlike the bulk of men in the field in question.
Scaled scores may be useful in certain types of studies, as when
differences between groups are being studied.
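
The two percentages just quoted can be checked against the normal curve.
The sketch below is illustrative only: the standard-score scale (mean 50,
standard deviation 10 in the criterion group) and the cut points of 45
for an A and 30 for a C are assumptions introduced here, since the
passage gives only the resulting percentages.

```python
from statistics import NormalDist

# Assumed scale (not stated in the passage): standard scores with
# mean 50 and standard deviation 10 in the criterion group.
criterion = NormalDist(mu=50, sigma=10)

# Assumed cut points: A at 45 and above, C below 30.
A_CUT = 45
C_CUT = 30

share_a = 1 - criterion.cdf(A_CUT)  # proportion of the criterion group rated A
share_c = criterion.cdf(C_CUT)      # proportion of the criterion group rated C

print(f"rated A: {share_a:.0%}")
print(f"rated C: {share_c:.0%}")
```

Under these assumptions the shares come to about 69 and 2 percent, which
agrees with the figures given for the letter ratings.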

Standardization and Initial Validation. Much of what might normally
be discussed under this heading has already been treated under the
sections on scoring and norms, because of the unique nature of the
Vocational Interest Blank as an inventory based on group differences and
scored by different keys for each occupation on which it has been
standardized. It is also difficult, in a sense, to distinguish between
initial validation and subsequent validation, because of the basic
nature of some of the later studies which Strong has made of his
inventory. However, it will clarify matters briefly to outline the steps
gone through in the first validation of each occupational scale of the
Strong Blank.

The Blank itself, it will be remembered, was the result of several
years of experimentation by Strong and his associates and students: his
list of 420 items, later abbreviated to 400, consisted of those which
had been found most useful in these various studies. In devising the
scoring scale for each occupation the inventory was administered, often
by mail, to men or women who had been in the occupation in question for
at least three years and who, in most cases, were distinguished by
having been nominated by well-informed persons as leaders in their
fields, by being listed in the appropriate Who’s Who, or by professional
certification. The scores made by these 1200 to 500 persons (see section
on Scoring for methodology) were distributed on the normal curve and
converted into standard scores and letter grades. It should be noted
that in this
procedure the norm group consists of the same persons who constituted
the criterion group, and that experience has repeatedly shown that when
these two groups are the same, the mean scores of subsequent groups will
be lower than those of the norm, even though all groups are random
samples of the same population. This has been noted and pointed out by
Strong (775: <> jq, 675), in experiments which showed that, when the
criterion-norm group consists of 250 persons, the shrinkage of scores
will be about 1.50 standard scores. He felt that it was wise to continue
this procedure, however, in order to have the largest possible criterion
groups. This was justified by the fact that the shrinkage for a
criterion group of 400 was only 0.90 standard scores. As there was very
little change for numbers above 300, Strong’s choice of criterion groups
of between 400 and 500 seems wise, but users of the blank must allow for
a shrinkage of about one standard score, not enough to be important in
most individual instances, but at times making the difference between a
B and a B+, or a B+ and an A. The experience of psychologists during
World War II, when, for example, norms were regularly gathered for 400
aviation cadets per day in one center, brought home the need for
cross-validation as a means of avoiding shrinkage: even supposedly
similar groups of 1000 cases frequently showed significant differences.
It is therefore to be regretted that Strong did not correct his norms by
the amount appropriate to the number of cases used, or subsequently
obtain data from new normative groups.

The procedure described above is the standard method of developing
an occupational scale for Strong’s Blank. Many other studies have been
conducted which validate the inventory, either through cross-validation
or by other means: we have, for example, seen studies of the validity of
responses (faking), age changes, and the effect of experience. Other
studies have considered the relationship between interest-inventory
scores and grades or sales production, but as these fit more naturally
into the discussion of field validation they will be taken up after the
next section.

Reliability. The odd-even reliability coefficients of 36 of the revised
men’s scales are reported in the manual as averaging .88, based on the
records of 28', Stanford seniors; only one coefficient was below .80,
the reliability of the CPA scale being .73. Taylor (813) found retest
reliabilities averaging .87 for high school boys and .88 for girls, on
the appropriate forms. The retest reliability was ascertained for
college students by Burnham (124) with eight of the original scales, the
average being .87. Strong obtained retest reliabilities after five
years, the first testing being when the 2S7 men in question were college
seniors: these averaged ~- )t and must be thought of as not only an
index of the reliability of the scales, but also as a measure of the
stability of interests in early adulthood. For 10th graders retested
after two years the mean for 7 typical scales was .^7 (i;r,); for 11th
graders after three years it was .71 (813). It is evident that the
scales are reliable enough for confident use in individual diagnosis at
least after age 17.
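
The odd-even coefficient mentioned at the start of this paragraph can be
illustrated with a small sketch. It assumes the customary procedure:
score the odd- and even-numbered items separately, correlate the two
half-scores, and step the result up to full length with the
Spearman-Brown formula. The manual's exact procedure is not described
here, and the item weights below are invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(half_test_r):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * half_test_r / (1 + half_test_r)

def odd_even_reliability(item_scores):
    """item_scores: one row per person, one weighted item score per column."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd_half, even_half))

# Hypothetical item scores for five persons on a six-item key.
responses = [
    [1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 0],
]
reliability = odd_even_reliability(responses)
print(round(reliability, 2))
```

With so few items and persons the coefficient is naturally far below the
values reported for the actual scales; the point is only the order of
the steps.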

Validity. The validity of Strong’s Vocational Interest Blank has been
investigated by relating the scores of its various scales to those of
other tests, to grades in school and college, to completion of training,
to earnings in sales work, to ratings of success in various types of
work, to persistence in an occupation, to differences between
occupational groups, and to job satisfaction. As this suggests, there
has been accumulated an unusual amount of validation data, even for an
instrument which has been in existence for twenty years.

To attempt to review all of these validation studies would be not only
a sizable task [Strong’s monograph (775) attained 746 pages even after
whole sections had been ruthlessly cut], but, because of Strong’s
thoroughness, an unnecessary one. There are, however, two reasons for
discussing certain selected studies here at some length: 1), an
understanding of these details is essential to adequate use and
interpretation of Strong’s Blank, and, 2), they should become an
integral part of the literature on vocational tests in order that the
first objective may be attained. Some of the studies discussed by Strong
are therefore treated here, together with others of special significance
which have appeared since he compiled his review.

Tests of intelligence have been correlated with Strong’s scales in eight
studies summarized by Strong ( 77 fd 333 “ 3 ‘M) and in others
subsequently published (77!)). The various investigators agree that the
relationships with scientific and linguistic interests are positive, the
former being moderate or low but significant and the latter so low as to
be of little meaning, as shown in Table 29.

The correlations with social welfare and business interests tend to be
negative, although most of the coefficients are so low as to make them
of little practical significance despite their theoretical implications.
Typical data are reproduced in the lower half of Table 29. It will be
noted that, although there are occasional discrepancies, there is
sufficient agreement so that analyzing the nature of the groups and of
the intelligence tests used in an attempt to reconcile them is
unnecessary.

Table 29

RELATIONSHIP BETWEEN INTELLIGENCE AND INTERESTS

(Taken from Strong, Table 90)

Occupational Scale                          Correlations

Psychologist                  .37    .43    .41    .36    .15    .38
Physician                     .16    .27   ■ •9    .04    .10    .24
Engineer                      .21    .20    .14  - 1 /    .08    .28
Chemist                       .30    .34    .15    .31    .03    .35
Advertising Man               .02    .14    .12   -.11    .45    .01
Lawyer                        .07    .21    .20    .13    .39    .13
YMCA Secretary               -.22   -.19   -.18    .14   -.15   -.18
Personnel Manager            -.16   -.10   -.13    .27   -.07   -.02
City School Superintendent   -.12    .03    .01    .32    .06   -.06
Office Worker                -.31   -.27   -.28    .09   -.38   -.25
Purchasing Agent             -.25   -.33   -.31    .00   -.07   -.21
Life Insurance Sales         -.35   -.34   -.31   -.19    .00  -.2t)
Vacuum Cleaner Sales         -.36   -.40   -.40   -.14   -.36

Special aptitudes have not been correlated with scores on Strong’s
blank in many published studies. The relationship between the Stanford
Scientific Aptitude Test and Strong’s six group scales was ascertained
by Long (|/S) for 200 college students although, as he points out, it is
not at all certain what the former test measures. He found significant
positive correlations (.26 and .50) with Strong’s two scientific scales,
and a significant negative relationship with the business-contact scale
(-.37); the others were negligible, and none could be explained on the
basis of intellectual differences as measured by the A.C.E.
Psychological Examination. Leffel (j(io) correlated scores on the
O’Rourke Mechanical Aptitude Test with Strong scores, showing positive
relationships (.42 and . ]fi) between the O’Rourke and the keys for
chemist and engineer, and negative relationships (-.25 and -.25) with
the scales for social studies teacher and lawyer. Holcomb and Laslett
(375) found comparable results for the Stenquist Mechanical Aptitude
Test. This suggests that aptitude, being more fundamental than interest,
may have some causal effect on the latter, but as noted previously the
O’Rourke and the Stenquist are information tests the scores of which are
no doubt influenced by both aptitude and interest, making it impossible
to infer causal connections. The latter study also used the MacQuarrie,
which correlated .22 with Strong’s engineer scale. Moore’s study (536)
showed correlations of .30 and .35 between the Bennett Mechanical
Comprehension Test and the engineer key, and of .21 and .26 with the
aviator scale, while the correlations between Bennett and Strong
production manager and carpenter scales were negligible. As the
MacQuarrie and Bennett tests are more strictly measures of aptitude than
the O’Rourke and Stenquist, it may perhaps be inferred that aptitude
plays a part in the development of interest. This seems warranted
despite Klugman’s (435) contrary finding with the Minnesota Clerical
Test and Strong’s women’s clerical keys: clerical interests appear to
have too little significance in women to attach importance to findings
based on them (see p. 436).

Interest and values inventories with which the Strong Blank has been
correlated include the Allport-Vernon Study of Values and the Kuder
Preference Record. Data from three studies of college students, one of
men (667) and two of women (188, 217), show that the relationship
between biological science interests and theoretical and aesthetic
values is positive, while that with economic and political values is
negative; the relationship is positive between physical science
interests and theoretical values, and negative between these interests
and political values; social welfare interests and social and religious
values positive; business contact interests and economic and political
values positive, the theoretical values negative; literary interests and
aesthetic and theoretical values positive, economic negative; and
business detail interests and economic and political values positive,
theoretical negative. The theoretical significance of these
relationships was discussed in connection with the nature of interests.

As the Strong Blank is the better understood of the two interest
inventories, the discussion of its relationship with the Kuder
Preference Record will be postponed until that instrument is the focus
of attention.

Personality inventory scores have been related to Strong’s scales with
results which are somewhat contradictory. These studies are discussed in
the preceding chapter and, as there is little in the way of
generalizations to be drawn from them which is of value in using
Strong’s Blank, they are not summarized here.

Grades and scores on achievement examinations have frequently been
correlated with scores on Strong’s inventory in the hope that the
prediction of educational success would thus be improved. The predictive
value of scholastic aptitude tests being far from perfect, it was
reasoned that motivation might account for part of the discrepancy, and
that motivation and interest should overlap to some extent. Accordingly,
a number of studies were made, many of which were not published and only
a few of which are cited here. Townsend (851) ascertained the
relationships between Strong’s scales and scores on objective tests of
school achievement made by groups of 50 to 100 boys in private secondary
schools, and reported that they were few and significant only in the
case of mathematics-science teacher and chemistry (r = .36),
accountant-chemistry (..jc)), CPA-chemistry (.42), and
mathematician-geometry (.31). Achievement in English and history was not
related to social science teacher or author-journalist interest scales.
A procedure used by Segel (702) is suggestive, for after correlating
Strong’s scales with Iowa High School Content Examination scores and
obtaining correlations between scientific scales and scientific subjects
which ranged from .28 to .,jc), but which were not significantly related
in other expected ways except for some negative relationships, he
proceeded to use differential achievement scores. These consisted of the
differences in the scores of two achievement tests, for example, the
difference between literature and science, which had a correlation of
.25 with life insurance sales interest. The correlations, both positive
and negative, were generally higher; those for scientific interests and
differential scientific achievement (e.g., science minus history and
social science scores) ranged from .29 to .57. Similar relationships
were found when school grades were used instead of achievement test
scores in this study, although the trends were not so clear cut,
presumably because of the more numerous other factors which affect
grades. The reason for the relationships between differential
achievement and interests being greater than the relationship between
achievement and interests is that, in the former, the effects of general
ability are held constant and those of differential motivation and
application are emphasized. If a student consistently makes B+ in one
field, and A in another, the relationships with interests will not be
clear, but if his relative superiority in the second subject is brought
out and correlated with interests, and the relative inferiority of his
performance in the former subject is similarly treated, the role of
interest is more likely to manifest itself.
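
The argument for differential achievement scores can be made concrete
with a small numerical sketch. All of the figures below are invented,
and the additive model (each achievement score equals general ability
plus or minus a subject-specific edge) is an assumption adopted for
illustration, not Segel's own formulation. Subtracting one achievement
score from the other cancels the general-ability component, so that what
remains is the relative strength with which interest is expected to be
associated.

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented data for four students.
general_ability = [10, 10, 30, 30]  # raises both achievement scores alike
science_edge = [5, -5, 5, -5]       # relative strength in science over history
science_interest = [8, 2, 7, 3]     # hypothetical scientific interest scores

science = [g + e for g, e in zip(general_ability, science_edge)]
history = [g - e for g, e in zip(general_ability, science_edge)]

# Differential achievement: science minus history; general ability cancels.
differential = [s - h for s, h in zip(science, history)]

r_raw = pearson(science_interest, science)
r_differential = pearson(science_interest, differential)
print(round(r_raw, 2), round(r_differential, 2))
```

In this contrived case the correlation of interest with the raw science
score is about .44, while its correlation with the differential score is
about .98, because the variation due to general ability has been removed
from the latter.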

Grades in college were related to scores on Strong’s scales by Alteneder
(i.j), who worked with freshmen at New York University. The
relationships were low (r’s ranged from -.28 to .30), but she reported
that low scholarship men tended to make higher engineering interest
scores than high scholarship men, who tended to have interests more like
those of teachers and CPA’s, while low scholarship women had interests
somewhat more like those of insurance saleswomen and stenographers than
high scholarship women, whose scores as librarians, social workers, and
lawyers tended to be high.

Typing and stenography grades of about 100 women liberal arts college
students were studied with the women’s stenographer scale by Barrett
(jf>) at Hunter College. She reported only the data from tests which
showed some validity; as the Strong scale “failed to show any
significant relationship to grades” data concerning it were not
reported.

Engineering grades were correlated with Strong’s engineering scale by
Berdie (78), Campbell (775:521), and Holcomb and Laslett (375). In the
first study the honor-point ratios of 151 University of Minnesota
students were the criterion; their correlation with interest scores was
.13. It is worth noting that having a variety of interests, rather than
only scientific, had no detrimental effect on achievement. In the second
study the correlation was .32. In Campbell’s study of 270 engineering
students at Stanford it was .185. For this same group the correlation
between social-science interests and grade-point ratios in social
science was .31 (social studies teacher). Holcomb and Laslett reported a
similar correlation (.32) between engineering interest and engineering
grades.

Dental grades were used as a criterion for the dentist scale by Robinson
and Bellows (634), who found a significant relationship (r = .13, .18,
.19). Data on 141 dental students were reported by Strong (775:523), who
found that those rating C on his scale made inferior grades (grade-point
ratio of 2.01), while those of others were slightly higher (2.41 to
2.58). The significance of the difference is not reported.

Medical school grades were subjected to study by Douglass (205) and
by Jacobson (397), the former reporting that Strong’s Blank was not
useful in predicting success, the latter, however, finding that the
first-year grades of students who were characterized by scientific and
other interests were better than those with other interest patterns,
those with medical interests as their only strong scientific interest
but with other types supplementing these ranked second, and those with
no scientific interests ranked at the bottom. In connection with the
top-ranking, broad-interest students, it is interesting that Berdie (76)
found that those with many “likes” get better grades than those with few
“likes.”

Teachers’ college students were the subjects of studies by Goodfellow
(295), Mather (775:526), and Seagoe (668). Goodfellow found, as Strong
did with dental students, that those who rated A on the appropriate
scale made better grades than those who rated C, the differences being
significant. Mather and Seagoe, however, both found no relationships
between grades and interests.

These contradictory findings in different studies of the relationship
between interests and achievement, reported even for the same
subject-matter or professional fields, might be explained on the basis
of the unreliability of the criteria in some studies, the limited range
of interests in most schools, and perhaps other factors which vary from
one institution to another. It is interesting that in none of the
published studies has the criterion been subjected to any scrutiny,
either as to distribution of grades or as to reliability, although in
numerous studies (e.g., 929) it has been demonstrated that the apparent
lack of validity of the predictor is attributable in the first place to
lack of reliability in the criterion. The limited range of scores in
professional-student groups has been commented on by Strong
(775:525-526), who contrasted the percentages of dental students
receiving the various letter ratings on his scale with the percentages
of college students in general: the latter received fewer A’s and more
C’s. This phenomenon suggests the need to use an approach other than the
correlational in studying the relationship between interests and
educational achievement, and perhaps a criterion other than grades.
Several studies have used another approach, but before describing their
results attention should be focused on one study of interests and grades
in which the range of the former was relatively great.
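
The effect of an unreliable criterion on an observed validity
coefficient can be sketched with the classical correction for
attenuation, which is offered here as a standard psychometric
illustration and not as a procedure used in any of the studies cited.
The predictor reliability of .88 is the average odd-even figure quoted
under Reliability, and the observed correlation of .13 is the
engineering-grades coefficient quoted above; the criterion reliability
of .60 is purely hypothetical, since none of the studies reports one.

```python
from math import sqrt

def validity_ceiling(predictor_reliability, criterion_reliability):
    """Largest correlation observable between two imperfectly reliable measures."""
    return sqrt(predictor_reliability * criterion_reliability)

def correct_for_attenuation(observed_r, predictor_reliability, criterion_reliability):
    """Estimated correlation if both measures were perfectly reliable."""
    return observed_r / validity_ceiling(predictor_reliability, criterion_reliability)

scale_reliability = 0.88  # average odd-even reliability quoted in the text
grade_reliability = 0.60  # hypothetical reliability of grades as a criterion
observed_r = 0.13         # an observed interest-grades correlation from the text

ceiling = validity_ceiling(scale_reliability, grade_reliability)
corrected = correct_for_attenuation(observed_r, scale_reliability, grade_reliability)
print(round(ceiling, 2), round(corrected, 2))
```

The lower the assumed reliability of grades, the lower the ceiling on
any observed coefficient, which is why the reliability of the criterion
needs to be examined before a predictor is judged invalid.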

Personnel-psychology students in the Army Specialized Training Program,
95 in all, were the subjects of a study by Strong (776), after the
publication of his book and the expression of opinions which might have
been somewhat modified had this study been completed first. In this
investigation the correlation between intelligence and grades in
psychology courses was only .20, whereas that for psychologist interest
was .275. Neither of these is quite significant (.31 required), but when
individual course grades are considered the picture is clearer:
correlations for testing and social psychology courses were .355 and
.150 for intelligence, and .32 and .3] for psychologist interests. As
the tendency in other studies has been for intelligence tests to be
considerably better predictors than interest inventories, possible
reasons were investigated. It was found that, since the soldiers in
question had all been selected partly on the basis of scholastic
aptitude (minimum score of 115 on the AGCT or in the top quarter of
white soldiers), the range of intelligence in the group was restricted:
in fact, 90 of the 95 made scores of 120 or above. In the typical
college freshman class, however, the tail of the distribution does not
end so abruptly (see Chapter 6). The range of interest in psychology
was, however, considerably greater, from low C to A, with a mean of low
G. Some of the men in the class reported, in conversation with Strong,
that they had been assigned to personnel-psychology training without
being consulted. This is quite different from the typical college
situation, in which the student is more generally in college for some
reason of his own, and has something to say about the curriculum in
which he studies. Even the required courses are then in one sense
electives, something accepted because they lead to some desired goal, be
it no more than playing football or being with friends. In the typical
college class a student can therefore make the grade if he has the
ability, regardless of interest in the subject-matter of the course; in
the ASTP many students lacked the motivation to use their ability. In
such circumstances one would expect to find, as Strong did, that
interest has approximately as high a correlation with achievement as
does aptitude.

Completion of training is the other criterion of educational
achievement used by some researchers. It avoids the fine distinctions
which grades attempt to make, stressing the more carefully considered
and perhaps more clear-cut distinctions between, 1), passing and
failing, and, 2), liking and not liking. In Goodfellow’s study (295),
for example, it was noted that the education students who changed to
other curricula made lower scores than did those who remained in
education, and Strong (775 : 5 2 l) found that only 25 percent of the
dental students who rated C on his dentist scale graduated in from four
to six years, whereas 91 percent of those who rated A, 93 percent of
those rating B+, and 67 percent, each, of those rating B and B-,
graduated. These findings fit in with Strong’s explanation of the role
of interest in educational achievement (775:529):

“If a student has sufficient interest to elect a course, his grade will
depend far more on his intelligence, industry, and previous preparation
than on his interest. Interest affects the situation, however, in
causing the student to elect what he is interested in and not to elect
courses in which he is not interested. When a student discovers he has
mistakenly elected a course in which he has little interest, he will
finish it about as well as other courses but he will not elect further
courses of a similar nature.”

To this should be added, in view of the one study in which the range
of interest was adequate: When a student is compelled to take a course
or to study in a field not of his own choosing, the relationship between
interests and achievement will be more nearly comparable to that of
intelligence and achievement.

Vocational preference has been frequently demonstrated to have little
long-term reliability or realism in adolescence (e.g., 613), although in
college students it has generally proved more stable and realistic
(775:355). It is often asked, however, whether the scores of interest
inventories provide one with information sufficiently different from
expressions of vocational preference to justify the time and expense;
and, as the preferences of some groups of college students have proved
rather stable, it has been suggested that in their case the inventories
may be of little value (926). Counselors working with clients in
schools, colleges, and guidance centers frequently comment on the large
number of cases in which Strong’s Blank merely confirms what one already
knew from interviewing the client and, what is more, what the client
already knew himself. It is therefore pertinent to inquire concerning
the relationship between Strong scores and expressed preferences: some
relationship would presumably be evidence of validity, while a nearly
perfect relationship would suggest substituting a single question for
the whole inventory.

In two investigations (45°,yuj), the conclusion was drawn that scores
on Strong’s Blank were less useful than expressions of preferences. In
both instances the conclusion was based on low correlations between
inventory scores and vocational preferences of high school girls, and on
the tendency of the former to be more concentrated in a few fields than
the latter. But both studies involved the Women’s Blank, in which the
clustering of scores has been seen to be due to the strength of one
factor which is common in women and in many women’s occupations.

Bedell (yep found that only two of 17 women’s scales had correlations
of more than .50 with the self-estimated interests of freshmen women.
Data for 1000 men at the University of Minnesota were analyzed by Darley
(189:21-25), with a resulting contingency coefficient of . p$ between
claimed vocational choices and inventoried interests as determined by
his classification of Strong scores into primary, secondary, tertiary,
and no interest patterns. An examination of the basic data is perhaps
more revealing of the inadequacy of expressed preferences as indices of
measured interests. Scientific choices were indicated by 37 j men, of
whom only 71 had primary measured interests in the scientific field,
21 } had no primary interest patterns, 45 had business detail interests,
and the rest were scattered among the other fields; 137 claimed
linguistic choices, of whom 26 had measured primary interest patterns of
that type, O5 had no primary patterns, 21 had social welfare interest
patterns, and the rest were scattered; dip claimed business detail
preferences, while 60 had measured primary patterns of that type, Gp had
no primary pattern, 16 had business contact patterns, and the rest were
scattered throughout other categories. Allowance must be made for the
fact that many had secondary patterns in the field of their claimed
interests, but even then the discrepancies are substantial. Moffie
(ryg j) worked with NYA boys averaging 18.7 years of age, who rated
their interests in the fields assayed by Strong’s Blank and were scored
with Strong’s group and specific scales: the correlations ranged from
-.07 to .47 and from -.05 to .54, respectively. Moffie’s explanation is
that lack of maturity and experience
on the part of adolescents invalidates their judgments of their interest in
different types of work, while the pattern scores of an inventory succeed
in tapping their interests more adequately. It might be suggested, also,
that this lack of experience and insight is greater in some areas than
in others. Some occupational fields, e.g., teaching, are more open to
observation by the average youth than others, making easier the
formation of preferences on the basis of interest, while others such as
certified public accountant (which has the least reliable scale so far
developed) are not so readily observed. Great variations in the
agreement of ratings and inventory scores of individuals were found by
Arsenian (30), further substantiating the hypothesis that maturity and
experience, which vary from one person to another, account for the
differences in agreement between measured interests and preference or
choice. Finally, data reported by Wrenn (9 p.913) show that the more
intelligent college students are more likely to “choose” occupations in
which they make high scores (15 percent of the superior group rate A on
chosen occupation, 3 percent C), while the less able are more likely to
make low scores in their preferred occupational field (22 percent rate
A, 20 percent C). This suggests either that the more able students have
more insight into their interests than the less able, or that their
superior verbal ability enables them more adequately to integrate their
rationalizations concerning interests. Whether or not we are dealing
with rationalizations or insights can be ascertained from the extent of
the relationship between inventory scores and objective criteria such as
completion of training, grades, and stability of employment in a field.

The relative predictive value of inventoried interest and expressed
preference has been studied only by Wightwick (926), who found that ,-j j
percent of 115 college women were employed in the field of their freshman
choice four years after graduation, and 73 percent in the field of their
senior choice, in contrast with 58 percent employed in occupations in
which they had as freshmen made A or B+ ratings. This led the author
to the conclusion that measured interests are not as valid predictors of
vocational choice as expressed preferences, a conclusion which seems
rather odd as it can be based only on a comparison of freshman inventory
scores with senior preferences (58 vs. 73); a comparison of freshman test
results and freshman preferences suggests, instead, that inventories are
superior to expressed preferences (78 vs. .{j). The greater validity of
senior preferences is no doubt due to the nature of the criterion: field
entered. It is to be expected that senior preferences would reflect an
element of realism, including considerations of finances, opportunities,
and family pressures which would make them perhaps less valid indices of
interest than test scores, but more valid predictors of occupation entered.

Unfortunately Strong’s nine- and ten-year follow-up studies (775:393- 
I03) have not been analyzed in the same manner as Wightwick’s. They 
do show that about three-fifths of his college students were employed in 
the field of their freshman or senior choice five and ten years after
graduation; they also show a substantial relationship between interest scores
in college and field of subsequent employment, as seen in the discussion 
of the permanence of interests and as brought out below in the material 
on job satisfaction, but the data are not so organized as to show what 
percentage of men entered and remained in fields in which they made 
A, B+, or lower scores.

The relatively low correlation between expressed preferences and
inventoried interests in high school, the tendency of the less able
students to prefer fields in which they lack measured interests, and the
superiority of inventories to the expressed preferences of college
freshmen in the one known study which has made such a comparison with
objective evidence as a criterion, suggest that inventories can improve the
quality of counseling and prediction. With college upperclassmen and
adults expressed and inventoried interests will probably generally be
found to agree, but in some cases insights of this type are lacking,
especially when external pressures have been at work on the client.

Vocational achievement has served as a criterion of the validity of
Strong’s Blank particularly in work with life insurance salesmen. It might
be argued that the criterion of the validity of an interest inventory should
be satisfaction, rather than achievement; certainly satisfaction should be
one of the outcomes of interest. But if interest produces satisfaction it
should also result in achievement, granted the necessary abilities, for the
satisfied worker should throw himself more whole-heartedly into his work.
This might not be true of all occupations, for theoretically there might
be some fields in which the work can be done equally well regardless of
interest and satisfaction in the work, provided the end-result (pay,
prestige, etc.) is desired; but in other fields the congeniality of the
activities engaged in might be important to success.

That insurance sales is one of these latter is indicated by a number of
studies by Strong (77a), Bills (90,91,92), Ghiselli (287) and others, most
of which are summarized in Strong’s book (775:487-500); illustrative
data are therefore considered here.

Only one of Strong’s studies used as subjects a group of applicants for
employment as insurance salesmen, the other groups consisting of men
already employed or, in one instance, released, by their company
(775:487-488). In the pre-employment study, the applicants were tested in a
small agency, i»o were employed, and only if> remained more than three
and one-half months. The data of the pretested group are therefore not
very conclusive, although they do show a clear tendency (r = .,j8) for the
higher-scoring men to sell more insurance. When data from all groups,
all agencies, were combined the relationship between interest scores and
sales (criterion reliability = .81) is as shown in Table 50, adapted from
Strong (775).

Table 30

those who sold the most insurance: 5b percent of the A men sold enough
insurance to make a living by then-current standards ($150,000), as
compared with only b percent of the C men. Although the coefficient of
correlation for 181 of these cases is only .57, the relationship is
statistically and psychologically significant, for it must be remembered
that most of the men were tested after a long period of employment, after
the low-producing and low-scoring men had been eliminated by natural
selection. The greater range of scores and sales which would characterize
applicants would undoubtedly yield a higher correlation coefficient. If
these men had all been tested as applicants for employment, or better
still as college students, and had made similar scores, findings would be
quite convincing. As all but if> of them were tested after they had been
on the job some time, it is possible, however, that some of the poorer
salesmen indicated liking for fewer of the sales items than did their more
successful fellows, not because they actually liked sales work less, but
because they were somewhat dissatisfied with the financial results of their
work. Apparently Strong has not taken this possible rationalization of
failure into account, for he makes no mention of the possible differences
between pretested and posttested responses. But even if such forces were
at work, the relationship between inventoried interests and success in
selling life insurance is noteworthy.
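
The argument that elimination of low producers shrinks the observed
correlation can be illustrated with a small simulation. What follows is
only a sketch under stated assumptions: the applicant correlation of .50,
the sample size, and the rule of keeping the upper half on sales are all
hypothetical choices made for illustration, and none of the numbers comes
from Strong's data.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(rho=0.5, n=20000, keep_fraction=0.5, seed=0):
    """Return (r among all applicants, r among survivors).

    rho, n, and keep_fraction are hypothetical illustration values.
    Survivors are the applicants whose simulated sales fall in the
    top keep_fraction, mimicking the loss of low producers.
    """
    rng = random.Random(seed)
    scores, sales = [], []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        scores.append(z1)
        # Sales share a correlation of rho with interest scores.
        sales.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
    full_r = pearson_r(scores, sales)
    cutoff = sorted(sales)[int(n * (1 - keep_fraction))]
    kept = [(s, y) for s, y in zip(scores, sales) if y >= cutoff]
    restricted_r = pearson_r([s for s, _ in kept], [y for _, y in kept])
    return full_r, restricted_r


full_r, restricted_r = simulate()
print(f"all applicants: r = {full_r:.2f}; survivors only: r = {restricted_r:.2f}")
```

Under these assumptions the correlation among the survivors comes out
noticeably lower than among all applicants, which is the direction of
the effect described in the text; the size of the drop depends entirely
on the assumed selection rule.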

The life insurance and real estate salesmen's scales of the Vocational
Interest Blank were combined in Bills’ study of 588 newly employed
casualty insurance salesmen and compared with ratings of success after
one year on the job. She found that 7!) percent of those who made low
scores were failures, while only 22 percent of the high scoring group
failed. Ghiselli worked with a much smaller group of casualty insurance
salesmen, 2cj in all, finding significant relationships for the CPA and
occupational level scales (.38 and .27). He reports that they tended to
make high scores on the business contact and detail keys, but that
contrary to Bills’ findings the contact scales did not correlate with
performance. As his cases are far fewer in number, the relation can hardly
be considered disproved by this one study.

Another type of salesman, selling detergents on a wholesale basis over
large territories and acting as service men on related matters (service
time correlated .7,1 with profits), was investigated by Otis (5.So). The
group was necessarily small, as there were few territories and the turnover
rate was low (N = 17). His criterion was selling cost, with which the
combined life insurance and real estate salesmen’s scales correlated .50.
With numbers as small as these the data are merely suggestive but
promising.

Accounting-machine salesmen, 1.J3 in number, and 283 service men of
the same types of machines, were studied by Ryan and Johnson (btio).
They found that the two groups were differentiated from the general
population by especially constructed standard-type scales, but that scores
on these scales had no relationship to success. They then developed
another set of scales based on the differentiation of successful from
unsuccessful men in the occupations in question. These scales did
differentiate other groups of successful and unsuccessful men in the same
jobs, the critical ratio for the service men being pS.

The relationships between interests and achievement in several other
occupations have been summarized by Strong, often from unpublished
studies; it is from him (775;501 —5 ( »|) that the following are taken, except
when otherwise indicated.

Psychologists who were starred in American Men of Science averaged
.jS.y on the psychologist key. Strong explains this slightly below average
score on the basis of the low scores made by some applied psychologists,
two of whom scored below 30 and later went into business, but this
explanation seems unnecessary in view of the expected shrinkage in the
means of new groups when criterion and norm groups are one and the 
same. It can therefore only be said that eminent psychologists taken as 
a group do not seem to differ from somewhat less eminent psychologists 
(the Fellows of the standardization group). 

Teachers were rated by Ullman and by Phillips for success of
performance; the ratings did not correlate with interest inventory scores.

Engineers rated as outstanding by an engineering dean were compared
with full and associate members of the four engineering societies. The 
outstanding engineers made higher scores than the associates. 

Aviators who failed in flying training were not significantly lower on 
the aviator scale than were those who were successful in training, perhaps
because of the small size of the samples. Another set of pilot scales was
constructed in a study initiated by the writer in the Air Force (31b:
f)o8-6n). A total of 650 aviation cadets were tested with the Vocational
Interest Inventory, and scales were developed on the basis of item
validities, the scale based on even-numbered cases being cross-validated on
odd-numbered cases, and vice versa. The correlations between these scales
and success in primary flying training were insignificant (—.03 and —. jo),
confirming what Strong found with smaller groups.
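
The odd-even procedure just described can be sketched in a few lines.
This is an illustration only, not a reconstruction of the Air Force
analysis: the item responses and pass-fail outcomes below are random
numbers generated for the purpose (so that no true relationship exists),
and the number of items, the selection threshold, and the unit weights
are hypothetical. Only the count of 650 cases is borrowed from the text.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation; returns 0.0 if either list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom else 0.0


def build_key(responses, passed, threshold=0.05):
    """Keep items whose endorsement rate differs between pass and fail groups.

    Returns {item index: +1 or -1}; the threshold is a hypothetical choice.
    """
    pass_rows = [r for r, p in zip(responses, passed) if p]
    fail_rows = [r for r, p in zip(responses, passed) if not p]
    key = {}
    for j in range(len(responses[0])):
        diff = (sum(r[j] for r in pass_rows) / len(pass_rows)
                - sum(r[j] for r in fail_rows) / len(fail_rows))
        if abs(diff) >= threshold:
            key[j] = 1 if diff > 0 else -1
    return key


def validity(responses, passed, key):
    """Correlate keyed scale scores with the pass (1) / fail (0) criterion."""
    scores = [sum(w * row[j] for j, w in key.items()) for row in responses]
    return pearson_r(scores, [1.0 if p else 0.0 for p in passed])


# Invented data: 650 cases, 100 like/dislike items, outcomes unrelated to items.
rng = random.Random(1)
n_cases, n_items = 650, 100
responses = [[rng.randint(0, 1) for _ in range(n_items)] for _ in range(n_cases)]
passed = [rng.random() < 0.5 for _ in range(n_cases)]

even_resp, even_pass = responses[0::2], passed[0::2]
odd_resp, odd_pass = responses[1::2], passed[1::2]

key_even = build_key(even_resp, even_pass)
key_odd = build_key(odd_resp, odd_pass)

dev_r = validity(even_resp, even_pass, key_even)          # same cases: inflated
cross_r_even_key = validity(odd_resp, odd_pass, key_even)  # fresh cases
cross_r_odd_key = validity(even_resp, even_pass, key_odd)  # and vice versa

print(f"development r = {dev_r:.2f}")
print(f"cross-validated r = {cross_r_even_key:.2f} and {cross_r_odd_key:.2f}")
```

Because the simulated items carry no real information, the key looks
valid on the cases used to build it and loses that appearance on the
other half, which is why only the cross-validated coefficients are
worth reporting.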

Advertising men, 36 in all, were rated by three officials of their agency. 
Although the significance of the relationship was not tested, the men 
with higher ratings tended to have higher scores on Strong’s advertising
scale. 

Foremen, 59 of those employed by a large chemical plant, were rated 
for characteristics which are not described. The correlations between 
ratings and Strong scores were .34 for chemist, .31 for engineer, .25 for 
CPA, and —.31 for life insurance salesman. These relationships are such 
as might be expected in a sub-professional technical job, except that with 
CPA. Thirty others were tested by Schultz and Barnabas (682), and 
were rated for budget-control efficiency and employee relations. The 
correlations between Strong’s scales for production manager and occupa¬ 
tional level, on the one hand, and combined ratings were respectively
.38 and .22.

Janitor-engineers rated above average in their work (N = 44) were 
found by Berman, Darley, and Paterson to make higher scores on the 
technical and scientific but not on other scales than did a group of 23
who were rated below average. In the same study 123 policemen rated
by their captain were found to be differentiated on the basis of scales
which measure interest in social contacts. 

Summarizing the evidence on the relationship between inventoried 
interests and success in an occupation, we have seen that it is significant 
in the case of several quite different types of sales jobs, although in some 
this is so only when success-failure rather than occupational-differences
keys are used. Success in psychology and in teaching were not related to 
the degree of similarity of interests to those of persons employed in those 
fields, but success in advertising, technical foremanship, janitorial work, 
and police were. Successful and unsuccessful aviators were not differen¬ 
tiated by success-failure scales. 

The sales data are consistent with the writer’s hypothesis concerning 
interest and achievement, for selling life insurance requires a substantial 
degree of self-direction and willingness to persist in the face of a cool
welcome; presumably only a person who finds a real challenge in locating
prospects and in making himself pleasant and helpful to them could
make enough calls to earn a living. Congeniality of the work is
important, and there is a significant relationship between interest and
achievement. The same is true of casualty insurance salesmen, and of
wholesale salesmen in whose work service to customers is an important
function. But in somewhat more routine sales work interest is related to
success
only when the interests of successful men in the occupation are contrasted 
with those of failures in the same field rather than with those of men-in- 
general. 

The other findings are more difficult to synthesize or rationalize. The
apparent contradictions may lie in the differences in the criteria of
success: being starred in American Men of Science for one’s research
contributions is not comparable to being rated highly for the successful
management of advertising accounts. Perhaps advertising is partly a
sales occupation (r with life insurance salesman = .59), in which case the
importance of interest is explainable in the same terms. Psychology and
teaching are non-competitive, and success in both fields can be achieved
in a great variety of ways; perhaps congeniality is less crucial to them
because of their varied outlets. Just why interest should so clearly play
a part in success in non-competitive fields such as foremanship, janitorial
work, and police is difficult to see. Congeniality could be important, in
that all three groups have to put up with the vagaries of a variety of
people, but then so do teachers. More studies, with more detailed analyses
of the duties of those involved, are needed before the significance of these
findings will be clear.

Occupational differentiation being the basis on which the Strong Vo¬ 
cational Interest Blank was constructed, most of what might be covered 
in this section has already been dealt with in earlier sections, particu¬ 
larly those concerned with the construction of the occupational interest 
scales. Some occupations have been studied without scales having been 
developed for them: for example, Bluett (102) ascertained the patterns
of interest scores characterizing vocational rehabilitation officers. The
applicability of the adult scales to pre-occupational groups was verified
by Goodman (29b), who found that engineering students differed in the
expected ways from liberal arts students, and by Barrett (95), who found
that women college students majoring in art made higher scores on the
artist scale than did other students. But the most significant problem still
to be discussed is that of the differentiation of women’s vocational groups
on the basis of their interests. 

Women’s and girls’ interests have been investigated with Strong’s
Blank by Laleger (150), Skodak and Crissy (719), Crissy and Daniel
(182), and others besides Strong himself (775: 1 (>2-1(18). These studies
have shown that it is more difficult to differentiate women on the basis
of their interests than it is men. The manual for the Women’s Blank
shows a surprisingly large number of substantial correlations between
occupations which would not, on the basis of data for the men’s form,
be expected. The correlation between the women’s office worker and
nurse scales, for example, is .55, while that between office worker and
housewife is .84. It has frequently been noted that populations of high
school girls and college girls tend to make far more high scores as nurse,
office worker, elementary school teacher, and housewife than should be
found in a random sample. Stuit (784) found this even among teachers’
college students. A suggestion as to why this may be the case emerges
when it is noted that the correlations between the housewife scale, on
the one hand, and those for nurse, physical education teacher, elementary
school teacher, office worker, and stenographer on the other are
respectively .59, .56, .84, .77, .80.

The factor analysis by Crissy and Daniel (182) referred to earlier
carries this thought further. They found four factors in women’s
vocational interests, three of which were like those found by other
psychologists in studying men, but one of which they called “male
association,” thereby bringing down on their heads a storm of protest from
women psychologists. It is this factor which others have called interest in
multiplicity of detail, interest in the convenience of others, interest in
order, and non-professional interests. It has a very slight loading in the
masculinity-femininity scale. Whatever the factor is, it seems to be present
in a great many women, especially in those in the occupations named, and
it is present in negative form in other women, particularly those who
make high scores as authors, librarians, artists, physicians, and social
workers. It is worthy of note that the occupations in which the so-called
male association factor is important in a positive way are those which
may be entered after a relatively brief and easily obtainable education,
whereas those in which it is of negative importance are by and large those
which require a longer and less easily obtained education or which are
entered only by the persistent and highly motivated. It would be helpful
to have the marriage rates in each of these occupations, in order to
ascertain whether or not those who are characterized by a strong “male
association” factor do in fact marry in greater numbers. Observation
suggests that the loss of women office workers through marriage is greater
than the loss of women authors, physicians, and social workers for the
same reason, but this is no doubt partly because the latter groups
frequently continue their work even after marriage. If both groups of
occupations marry with more or less equal frequency this factor can
hardly be named “male association”; and if it really is that, why is there
no evidence of a “female association” factor in men, most of whom also
marry? As the factor has been isolated only in women, is positively
related to stopgap and negatively related to career occupations, and is
more important in the occupation of housewife than in any other (factor
loading .8g), it is suggested that this is in reality a home-vs.-career factor.
The home or career decision is one which many women have to make,
and which most decide in favor of the home. It is presumably the
presence of this factor which makes it difficult to measure the vocational
interests of women with Strong's technique, for it outweighs vocational
interests in many instances. As will be seen in connection with the Kuder
Preference Record this difficulty is not an insurmountable obstacle to
the measurement of women’s interests, but to overcome it it is necessary
to use a different type of inventorying device.

Satisfaction in one’s work has seemed to most psychologists and
counselors to be the objective of counseling or employment on the basis of
interest inventories. But the appraisal of vocational satisfaction is not
a simple matter, for a multiplicity of factors are involved and not all
of them are easily accessible. The criteria of vocational satisfaction in
studies of interests have consisted of stability in the occupation (in
contrast to the position), and expressions of satisfaction or dissatisfaction
by the worker.

Occupational stability was the criterion favored by Strong (775:384- 
588) and used in his follow-up studies. It is reasoned that interest deter¬ 
mines the direction of effort, ability the level of achievement. The 
criterion of a vocational interest inventory should therefore be the extent 
to which it predicts the direction of effort. College students who enter 
an occupation and remain in it for ten years after graduating from col¬ 
lege are presumed to be interested in and satisfied with the direction 
of their efforts, even though a few are known to persist because of family
or economic reasons. Those who change from one field to another are
presumed to do so because they find the first field of activity
unsatisfactory, and expect that the second will prove more so, despite the
fact that some individuals change fields of work for economic reasons. If
these assumptions can be granted, and they probably can in higher-level
economic groups such as graduates of a private university, then occupational
stability is a good index of vocational satisfaction and a suitable criterion
of the validity of a vocational interest inventory. 

The ten-year follow-up (775:393) consisted of 287 Stanford University
seniors tested in 1927 and followed up in 1928, of whom 223 were re¬ 
tested in 1932, and 197 again retested in 1937. The nine-year follow-up 
was based on 306 Stanford freshmen tested in 1930, of whom 174 were 
retested in 1939. The principal findings and conclusions are as follows: 

1. Men continuing in an occupation for 5 or 10 years after college
made higher scores in it than in other occupations (mean standard
score 50.2 vs. 47.7).

2. They tended to make higher scores in that occupation than did
other men (data too complex to reproduce here).

3. They made higher scores in that occupation than did men who
changed from that occupation to some other (standard score 48.0
vs.

4. Men changing from one occupation to another after employment in
the first field did not make higher scores on the latter occupation
when in college, but their average scores were substantially lower
in both the first and the second occupation than were those of men
in groups 1, 2, and 3, above (standard scores 42.4 and 40.5), which
suggests that those who change occupations have less clearly defined
interests, or less insight into them, than do those who remain in
the occupation of their first post-college choice.

The hypothesis that interest inventory scores are manifestations of
stereotypes does not seem to be sufficient to explain away these findings.
It could, if true, remove the significance of the first finding, since an
unchanging stereotype would be the result of staying in the same
occupation; it could do the same for the second finding, for the men who
enter a given occupation would be expected to have the relevant stereotype
to a higher degree than others; and it could be argued that the men in
group three who changed to other fields did so because they found that
their concept of the occupation and of their role in it did not coincide
with the facts; but the fourth finding and conclusion imply that an
interest pattern is more than a mere stereotype or even a more
deep-seated self-concept, and is rather the product of a more fundamental
combination of personality traits, aptitudes, and modifying experiences.
This fourth group apparently lacked the highly organized personalities
(in the broadest sense) which characterized the other groups, as indicated
by the mean standard scores, even after several years of occupational
experience in which they might have acquired the stereotype. The lack
must have been one of aptitudes, temperament, and values.

Women were followed up eight years after testing and four years after
graduation from college by Wightwick (926). Of her 115 subjects, 58
percent were employed in occupations in which they had made A or B+
ratings, while 77 percent were in fields in which they had at least tertiary
patterns. The data were not analyzed for stability of employment in
the same way as in Strong’s study, but in 1941, 43 percent were employed
in occupations in which they had made A or B+ scores in 1933.

These findings seem to be confirmed by trends brought out in a study
of 76 adult men by Sarbin and Anderson (667), in which the client’s
statement of vocational satisfaction or dissatisfaction was related to his
primary interest pattern. They found that 82 percent of the men who
expressed dissatisfaction with their current occupations did not have
primary interest patterns in the fields in which they were employed, but
there is no indication as to how many satisfied workers possessed primary
interest patterns in the field of their endeavor. If their data are
recomputed to permit another comparison, it appears that 57 percent of
those who had a primary interest pattern in the field of employment were
dissatisfied, as compared with 52 percent of those who lacked the
appropriate primary interest pattern. This would seem a strange finding,
were it not that the subjects were clients of an adult guidance center, and
therefore were, as might be expected, a predominantly dissatisfied group.
Although Sarbin and Anderson’s statement that “adults who complain of
occupational dissatisfaction show, in general, measured interest patterns
which are not congruent with their present or modal occupation”
(667:35) is exactly what one would expect to find, it can hardly be said
that they have demonstrated the truth of the statement.
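
The difference between the two ways of percentaging the same data can
be made concrete with a fourfold table. The counts below are invented
for illustration; they are not Sarbin and Anderson's figures, and were
chosen only so that they total 76 cases and yield percentages close to
those quoted above. The variable names are likewise hypothetical.

```python
def pct(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Hypothetical counts (NOT the published data): interest pattern in the
# field of employment, cross-classified with expressed satisfaction.
counts = {
    ("pattern", "dissatisfied"): 7,
    ("pattern", "satisfied"): 5,
    ("no_pattern", "dissatisfied"): 33,
    ("no_pattern", "satisfied"): 31,
}

total = sum(counts.values())
dissatisfied = counts[("pattern", "dissatisfied")] + counts[("no_pattern", "dissatisfied")]
with_pattern = counts[("pattern", "dissatisfied")] + counts[("pattern", "satisfied")]
without_pattern = total - with_pattern

# Percentaged within the dissatisfied: how many lack the pattern?
no_pattern_among_dissatisfied = pct(counts[("no_pattern", "dissatisfied")], dissatisfied)

# Percentaged within each interest group: how many are dissatisfied?
dissatisfied_among_pattern = pct(counts[("pattern", "dissatisfied")], with_pattern)
dissatisfied_among_no_pattern = pct(counts[("no_pattern", "dissatisfied")], without_pattern)

print(no_pattern_among_dissatisfied)   # share of the dissatisfied lacking a pattern
print(dissatisfied_among_pattern)      # dissatisfaction rate, pattern present
print(dissatisfied_among_no_pattern)   # dissatisfaction rate, pattern absent
```

With counts of this shape the first percentage is high simply because
most of the clients lack a primary pattern in their field, while the two
dissatisfaction rates barely differ; only the second comparison bears on
whether interest patterns are related to satisfaction.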

Satisfaction in a professional curriculum was correlated with inven¬ 
toried interest in that field by Berdie (77), in a study of 154 engineering 
sophomores who had been tested as freshmen. Satisfaction was measured
by a modification of Hoppock’s Job Satisfaction Blank, in which the
term “curriculum” was substituted for “job” and “occupation.” The
correlation between scores on Strong’s engineer scale and satisfaction
score was .10, too low to be significant. When the data for 43 men whose
blanks had been scored for all occupations were subjected to analysis of
variance, it was found that those with no interest pattern in the
engineering field were significantly less satisfied than those with a
primary, secondary, or tertiary pattern in the physical sciences. The
numbers were
so small, however, as to make conclusions highly tentative. 

Although the evidence concerning interest and job satisfaction which
consists of occupational stability data is impressive, there is a need for
further studies using clinical and psychometric indices of vocational
satisfaction. Sarbin and Anderson’s study was a step in this direction,
as was Wightwick’s, but an adequate investigation of clinically or
psychometrically determined vocational satisfaction in relationship to
inventoried interests has yet to be made.

Use of Strong's Vocational Interest Blank in Counseling and Selection.
The findings of research which have been reviewed in the preceding
sections have shown that interest is not a completely independent entity,
but rather something which is related to general ability, special aptitudes,
and values in various ways. Linguistic and scientific interests are
positively correlated with intelligence, technical interests are related to
mechanical aptitude, and business interests are related to the tendency to
stress material as opposed to theoretical, social, or aesthetic values, to
cite just a few of these relationships. But the very complexity of these
relationships supports the hypothesis that interests are sufficiently unique
to warrant special consideration in the study of an individual or a group,
and other evidence shows that they have significance in and of themselves
which makes their study important. It seems likely that aptitudes,
values, and perhaps temperament are fundamental factors which, together
with experiences in childhood, determine the development and
nature of interests, but the end result is a type of individual differences
which take on a character of their own. There seems to be something
magnetic about interests, pulling people in their direction and holding
them in place once there.

The development of interests has been seen to be well under way by
adolescence, for by age 14 or 15 the interest patterns of boys and girls
have begun to take forms similar to those of adults, and these patterns are
generally modified by increasing maturity by becoming more clear cut,
and by a tendency, in boys at least, toward greater socialization of
interests. By the time boys and girls are from 18 to 20 years of age their
interests are fairly well crystallized, and in most cases change very little
thereafter.

The occupations for which Strong’s inventory has been validated are
primarily professional, managerial, and clerical, although a few skilled
occupations are included among the scales. Its usefulness is therefore
primarily with those persons whose intellectual and educational level is
high enough to provide a sound basis for aspiration to the middle or
upper half of the occupational ladder. The men’s form can be scored for
about 40 occupations, the women’s for more than 20; while these seem
like very few, compared to the large number of jobs which have been
differentiated in other ways, the limitations of the instruments are not
as great as this suggests. The occupations are more broadly defined than
in the Dictionary of Occupational Titles (888), for example, and what
is more, the intercorrelations have shown that they fall into interest
families, that these occupations can be grouped according to common
underlying interests. This means that by using this inventory and scoring
it for a relatively small number of occupations one can tap interest in
a few core fields in which most known occupations could probably be
placed. It is important to bear in mind, however, that interest is not
necessarily a predictor of success, even when needed abilities are present,
for interest seems to be related to success only when the congeniality of
the activities in question affects application, and when the effects of
application are readily determined, as in competitive work such as sales.
It seems to be much more likely to be important to satisfaction and
stability in a field than to quantitatively judged success. The women’s
form is not as satisfactory as the men’s, because of the commonness of
one interest factor in women; it is only in the cases of those with clear-cut
career interests that it is likely to prove valuable.

In school and college the Vocational Interest Blank is sufficiently well
understood by 10th graders to be used with them, and their maturity is
great enough to give their scores meaning despite the fact that there is
some subsequent modification of interests. The occupational interest
scores have value for the prediction of educational achievement when
screening on an ability basis has already taken place, and when there has
been no screening on the basis of interest. In most situations, however, the
choice of curricula or courses by students gives them enough of an
elective character to nullify the relationship between interests and grades.
Completion of a sequence of courses or of professional training is,
however, related to interests as measured by the Strong Blank, for those whose
interests are unlike those of people in the same occupational field tend
to drop out more frequently than do students with appropriate interests.
The inventoried interests of high school students are of more value in
vocational diagnosis than are their expressed preferences; on the other
hand, the preferences of college students are likely to be mature enough
to warrant more serious consideration, and are likely to be not much
less significant in freshmen, and slightly more significant in seniors, than
measured interests. The younger and less able the boy or girl, the more
need there is for a good interest inventory; Strong’s Blank seems to meet
this need within the normal ranges of male high school juniors and
seniors, college students, and adults; it does so less well for girls and
women, because of the career-vs.-home factor. The older and brighter the
individual, the less likelihood there is that Strong’s Blank will reveal 
anything new to the subject, although the confirmation of interests is 
often very helpful and new light is sometimes thrown on confused or 
poorly understood situations. 

The counseling use of Strong’s Blank in school and college can
therefore be both for choice of curriculum and for choice of occupational field.
Students may be encouraged to major in fields in which they have 
primary interest patterns, with the knowledge that they are more likely 
to complete work in those fields than in those in which their interests 
are not so strong. Their choice of occupations for which they have 
appropriate measured interests may be viewed with more confidence that 
they will still prefer those fields after five or ten years of employment in 
them. Despite the possibility of faking scores in an attempt to impress, 
the inventory has similar value in student selection programs. 

In working with high school and college students one not infrequently 
encounters cases in which there seems to be no primary interest pattern. 
As these are usually students or clients who have no clearly defined
expressed preferences and who hope that the interest inventory will
discover some hidden interest, this experience is one which is especially
frustrating to novice counselors. The frequency of such cases in a college
population has been investigated by Darley (189:19-21, Ch. 5), who
found that slightly more than 52 percent of 1000 University of Minnesota
students had no primary (A and B+) interest patterns, while 16 percent
had only tertiary (B and B−), and 3 percent had no distinguishable
interest patterns. Darley set up the hypothesis that students with high
interest maturity and no primary interest pattern would make poorer
grades in college than students who had primary interest patterns, but
this was not verified by his evidence. As he puts it, “the case with no
primary pattern will continue to be clinically difficult for the counselor
. . . as usual, more and better research is necessary . . .” Strong showed
that the interests of business students are less clear cut than those of
professional students (775:420); he suggests that people with widespread
interests and often without primary interests should consider business,
particularly if they have secondary interests in the business groups
(775:430); but he, like Darley, ends by giving up: “These are the hardest of all
people to counsel, because they have so little to contribute and either
they have a lot of half-baked plans that change from interview to
interview or they sit back and expect the counselor to prescribe the remedy”
(775:441). In the writer’s experience with college students it also seemed
that the undifferentiated students were those who entered business, for
lack of something more challenging. He is somewhat reluctant to let the
matter rest there, however, in view of Strong’s findings concerning the
differentiation of people at lower occupational levels when a different
point of reference is used. Research in the “undifferentiated” group both 
in college and elsewhere should presumably be pressed, using other points 
of reference than that of the standard scales. 

In guidance centers the counseling use of Strong’s inventory is similar 
to that in schools, with the exception that there it is often given to entire 
classes as a part of a routine testing program, whereas in a guidance 
center it is part of a tailor-made battery individually administered. In 
mass testing which has been properly motivated the examinee’s answers 
are likely to be frank and free, for even though motivated to co-operate 
he is likely to feel that he has relatively little at stake. In the individual 
testing program there is more likelihood of self-scrutiny and of
unconscious warping of responses to make them congruent with an acceptable
self-concept. In the former case scores may not be as high, but they reveal
the patterning of specific interests more truly; in the latter case, they are
more indicative of self-concepts. Both types of data have their value, 
provided the counselor knows with what it is he may be working. 

A guidance center has an advantage over employment services and 
departments in the use of inventories such as this, in that its functions 
are recognized as being more advisory than administrative, the former
role being one which encourages frankness on the part of examinees.
Despite this fact, consultants making evaluations need to be alert for case
history material which tends to support or to contradict the evidence of
the inventory. It might be well if two inventories, known to differ in their
transparency, were used, to provide an index of tendency and of direction
of distortion of interest scores. The research necessary to the development
of such an index has not been carried out as yet, but the germ of the idea 
is to be found in a paper by Paterson (586). 

In employment services inventories such as this are rarely used, as the 
type of counseling offered there has generally to do with employment 
rather than with choice of a field of work, and the interests of employment 
applicants have generally seemed assessable by less complex methods. As 
more attention is paid to the needs of inexperienced youth, on the one 
hand, and to the careful appraisal of adults applying for competitive
jobs, on the other, interest inventories should probably find more use in 
employment services. 

In business and industry the use of Strong’s Blank has been confined 
to the selection testing of applicants for sales positions, particularly those 
in which the importance of the congeniality of the work, the independence 
of the salesmen, the intangibility of the item sold, or the competitive 
nature of the selling have been notable. These items include life
insurance, casualty insurance, real estate, business machines, and vacuum
cleaners. As work with this type of instrument began in an attempt to 
distinguish sales engineers from technical engineers one would expect to 
see other successful applications made as time goes on. Here, more even 
than in guidance centers, the possibility of faking unduly high scores 
needs to be considered. The indications are that despite this tendency 
the Blank is a useful sales selection instrument; an index of distortion,
such as that suggested above, would make it even more so.

The Kuder Preference Record (Science Research Associates, 1939, 1943,
and Short Industrial Form 1948)

WORK with this inventory was initiated by Kuder at Ohio State
University early in the 1930's, leading to the publication of the inventory in
1939. Three forms were tried out during this experimental period. After
the 1939 edition had been in use for several years it seemed desirable to
cover mechanical and clerical activities more adequately, and the second
edition was developed and published, incorporating also a change in the
form of the items. A short form for use in business and industry was
published in 1948. Publication of the inventory was welcomed by many
counselors in schools, colleges, and guidance centers, because it was more
economical to score than Strong's Vocational Interest Blank, then
practically the only inventory which had been well validated, and because it
also showed signs of having been subjected to a good deal of research.
Furthermore, its format and marking device had an immediate appeal
to students taking it. Users of vocational tests therefore often included 
the Preference Record in their batteries, interpreting its results in very 
much the same terms as those of Strong’s Blank, simply on the basis of 
the general similarity of the types of items and scores, which seemed like 
those of Strong’s group scales. Today the Kuder is one of the most widely 
used vocational tests and inventories, and additional evidence concerning 
the nature and vocational significance of the traits it measures is
published practically every month in the professional journals.

Applicability. The Kuder Record was designed for use with high 
school and college students, and with adult men and women. The items 
were so written as to be applicable to both sexes, the vocabulary was kept 
as nearly as possible at the high school level, and the content seems to 
have been selected for its familiarity to adolescents as well as to adults. 
Two reports on the suitability of the inventory for high school students
have been published. Christensen (157) tried it out on 27 ninth graders and
ascertained that many of the items were not understood; when the class
was instructed in the meaning of the items and retested, the scores changed
appreciably. The reading difficulty of the Kuder was checked by Stefflre
(752), who used the Lewerenz formula for vocabulary grade placement;
he found that the vocabulary difficulty grade level was 8.4, and that it is
easier than that of the Strong (10.4), the Allport-Vernon Study of Values
(11.3), and the Cleeton (12.0), but somewhat more difficult than that of
the Lee-Thorpe (6.8) and Brainard (6.4). These findings suggest that the
Kuder can be administered to typical 8th grade boys and girls, although
the less able will have difficulty with some items; its use at the 9th or 10th
grade levels is likely to prove satisfactory in this respect. Norms are
available for the interpretation of the inventory with high school students
and adults. 

The transparency of the items in the Kuder, or the ease with which 
faking and unconscious distortion of responses can take place, has seemed 
a problem to many users. That their objections have some basis in fact 
is suggested by the nature of the items, as inspection reveals them to have
rather obvious vocational implications. Both the Kuder and the Strong 
inventories were administered to a clerical employee being considered for 
transfer and promotion to a desired personnel position by Paterson (586), 
who compared the man’s responses on both forms. The data suggested 
that the employee’s interests were truly clerical, that he wanted to appear 
in the best possible light as a potential personnel appointee, that his scores 
were distorted in the direction of personnel interests by this fact, and that 
the Kuder was more affected by distortion than the Strong. As these 
are merely observations of one case they are not conclusive, but they do 
seem to confirm the general opinion of users of vocational interest
inventories. Two experiments designed to test the transparency of the two
measures, as Strong tested that of his own, have been completed. Bordin
(112) has reported one such, in which it was found that the professed
social service and literary interests of college students were more highly
correlated with Kuder than with Strong scores (e.g., r = .43 vs. r = .29),
suggesting greater transparency in the Kuder, but the trend was not
consistent for other scales. Cross, in an unpublished study of high-scoring 
students (181 males, 183 females), found clear-cut evidence of ability to 
lower and to raise Kuder scores according to directions. 

Another aspect of this question of the meaning of inventory items and
the orientation of respondents was investigated by Piotrowski (608), who
tested 18 superior students in a school of social work with the Kuder and
the Rorschach. All subjects scored high on the social service scale, but
psychiatric interviews led to the conclusion that only 11 of the inventory
scores were “valid,” while 7 were “invalid”; in other words, 7 of the social
workers were not genuinely interested in social welfare, but made high
scores because of conscious or unconscious distortion. The Rorschach
responses of the two groups were then compared, with the conclusion
that those who really had social service interests (as confirmed by interview
data) were closer to reality, had a wider range of psychological experiences,
were more realistic in their aspirations, more interested in people for
their own sakes, more self-confident, and less frequently subject to
despondent moods. While the results for other preoccupational or
occupational groups might reveal fewer invalid scores (assuming the validity of
the psychiatric interview) than a field such as social work, the evidence
does indicate that distortion of scores on the Kuder can seriously affect 
the results. 

There is, finally, the question of changes in responses to this type of 
inventory with increasing age. Although one would expect Strong’s 
findings to hold for interests however measured, there is the possibility 
that the form of the question and the method of scoring affect findings in 
the case of a particular instrument, making such generalizations unsafe 
until appropriate evidence is adduced. Retest reliabilities after a lapse
of 15 months were computed for 16 adult subjects (ages unreported) by
Traxler and McCall (St>S), who found that they ranged from .61 for social
service interests to .93 for musical interests, the median being .83. This
suggests a considerable degree of stability of responses. DiMichael and
Dabelstein, in an unpublished paper (200), found reliabilities ranging
from .70 to .89. Even for the least reliable scale, letter ratings (A = 75th
percentile or above) changed in only 9 percent of the cases. Traxler and
McCall, and Kuder in his manual, provide data showing that the changes
which take place during senior high school and college years are relatively
slight, making unnecessary the use of special norms for each high school
grade. This conclusion cannot be compared precisely with Strong’s, as his
norming procedure was different, but it does appear to be at variance
with it. Strong’s and Carter’s work, reviewed elsewhere, showed more
convincingly that certain changes do take place in the interests of adolescents
and that they are fairly well crystallized by the end, rather than by the
beginning, of the high school years; they have merely begun to take shape
by age 14 or 15. Until intensive work on age changes has been carried out
with the Kuder, it seems wise to assume that some changes such as those
known to take place in responses to Strong’s Blank also affect Kuder
scores.

Content. The Preference Record consists of preference items arranged
in triads. Item four illustrates the principle:

Build bird houses
Write articles about birds
Draw sketches of birds

The examinee decides which of these three activities he likes best and
marks it to show his first choice; then he decides which he likes least, and
marks it to show his third choice. The activities in each item are so
written as to tap three or more different types of interest, in this case
mechanical, literary, and artistic. There are 50j such items (standard form),
assessing interest in a total of nine¹ different types of interests.

Administration and Scoring. There is no time limit, as there are no
right or wrong answers; the time required by high school students is from
thirty minutes to one hour, by college students approximately forty
minutes. It is necessary to make sure that the directions for using the
response pins are correctly followed, but, as examinees are usually
intrigued by the mechanics of the inventory, motivating them to follow
directions is relatively easy. Scoring may be done by hand, using
appropriate answer sheets and a pin to prick answers, in which case the
common procedure is to have examinees do the scoring themselves. The
directions are clear, and it takes about fifteen minutes to obtain all nine
scores. Profile sheets are provided on which examinees convert their scores
to percentiles and plot them graphically. This method has generally been
found to be a good device for getting pupils interested in their scores and
to provide a springboard for discussion of vocational interests. Machine
scoring is also possible, with the use of special answer sheets. The scores
obtained are for mechanical, computational, scientific, persuasive,
artistic, literary, musical, social service, and clerical interests.

Norms. The 1946 edition of the manual contains norms for three
different base groups. The first consists of approximately 2000 boys and
2000 girls, in grades 10, 11, and 12, the three grades being lumped
together because of the lack of important grade differences but the sexes
separated because sex differences are significant. The second is made up
of adults engaged in a variety of occupations; 2667 men from 4} occupations,
and 1429 women in 29 occupations, again treated separately because
of sex differences. Thirdly, there are norms for college students, those
for women based on 1263 students in various curricula, while those for
men are, for the time being, derived from groups of about 200 each from
several different colleges. The profile sheets provided with the answer
sheets are based on the first two groups, and Kuder expects to provide one
for the third group. While these norm groups are helpful in providing a
backdrop against which to view the interests of an individual, their
composition is not as vital a question as in the case of Strong’s Blank, for with
the Kuder one studies the relative strength of each of nine different
interests within an individual, whereas in the Strong the comparisons
are basically between groups of individuals classified by occupations.

¹ A tenth, “Outdoor,” interest scale has been added.

Having obtained a profile of scores which shows the relative strength of
the different types of interests in the person being examined, the next
question which arises is that of the occupational significance of the profile.
It was the absence of occupational norms which made many users of
vocational tests hesitate to use Kuder’s inventory, despite the care with
which it was constructed and the economy with which it could be used.
It was not until after World War II, for example, that the writer used
it in counseling on anything other than an experimental basis, just
because it did not seem sufficient to know that a client was more interested
in mechanical activities than in any other type, when what counts in
vocational adjustment is how his interests compare with those of persons
who have succeeded in the field. This point has effectively been made by
Diamond (199a), in an important study of the occupational significance
of Kuder percentile scores.

The 1946 manual has to some extent made good this deficiency by
providing norms for 4 ] men’s occupations and 29 women’s, supplemented
by curricular norms for women college students in 24 different fields.
The numbers in any one group are small, ranging from 16 men English
teachers and if) women language teachers to 185 male meteorologists.
Strong’s work suggests rather clearly that these numbers are too small to
be reliable, but they are better than no data at all, and an unpublished
study by Triggs has shown that for one group, at least, the adding of
additional cases makes no difference. She tested 82(3 nurses, and found
that their mean and sigma differed little from those for Kuder’s group of
183. As the manual indicates that the test-author is interested in receiving
additional occupational data for norming purposes, it may be assumed
that better occupational norms will become available in due course.
Judging by the incomplete evidence in the manual, there is one other possible
defect in the occupational norms: that of sampling. This problem has
been amply discussed in connection with Strong’s work, so it need only
be pointed out here that better evidence needs to be supplied concerning
the type of employment, skill levels, degree of permanence, level of
attainment, and regional location of the representatives of any given
occupation. The manual does not mention these variables.

The occupational norms consist of the means and standard deviations 
of each occupational group on each interest scale, and graphic profiles 
based on these same means. The profiles permit a more rapid inspection 
of the data than do means, and enable the counselor to compare quickly 
his client’s profile with that of the various occupational groups. As the 
work of the Minnesota Employment Stabilization Research Institute (223) 
and the United States Employment Service (225) has demonstrated,
however, this technique has serious defects. Not only is it impressionistic
rather than exact, but the criterion upon which judgment is based is
unsound, for the counselee is compared with the average person in the
occupation rather than with the marginal worker. To put it concretely,
if the counselee is significantly below the mean of the occupational group
at two points of the profile, and significantly higher at two other points,
does that mean that the choice of that field would be unwise? It would
be more helpful to know the critical scores for each trait being measured,
for then a “low” score would be known to indicate a critical lack of a
trait which has been found to be related to success or satisfaction in the
occupation in question. This is the procedure now used by the United
States Employment Service in its General Aptitude Test Battery (225;
see also pp. 358 ff.). Diamond’s data (199a) are again highly relevant.
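
The contrast between the two methods can be put in a short sketch. It is only an illustration: the scale names, norms, and critical scores are invented, and the function names are hypothetical. The first function expresses each of a client's scores as a deviation from the occupational mean in standard-deviation units, which is the profile comparison criticized above; the second simply lists the scales on which the client falls below a stated critical score.

```python
def profile_deviations(client, occupation_norms):
    """Deviation of each client score from the occupational mean, in SD units.

    client: dict mapping scale name -> client's score.
    occupation_norms: dict mapping scale name -> (mean, standard deviation).
    All figures used with this sketch are invented for illustration.
    """
    return {
        scale: (client[scale] - mean) / sd
        for scale, (mean, sd) in occupation_norms.items()
    }


def below_critical(client, critical_scores):
    """Scales on which the client scores below the critical (minimum) score."""
    return sorted(
        scale for scale, cut in critical_scores.items() if client[scale] < cut
    )


# Hypothetical client and hypothetical occupational data.
client = {"mechanical": 60, "scientific": 40}
norms = {"mechanical": (70, 10), "scientific": (50, 5)}
critical = {"mechanical": 55, "scientific": 45}

print(profile_deviations(client, norms))  # both scales fall below the mean
print(below_critical(client, critical))   # only one falls below its critical score
```

In this invented case the client is below the occupational mean on both scales, yet below the critical score on only one, which is the distinction the text draws between comparison with the average worker and comparison with the marginal worker.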

To provide a less impressionistic method of comparing individual
profiles with those of persons established in various occupations, Kuder has
developed occupational indices which are a statistical summation of the
similarity of the examinee’s interest profile to that of the occupation in
question. The principle is similar to that used by Strong, although Strong
applied it to a series of items whereas Kuder applied it to scores on a
series of scales. Only one occupational index has so far been published,
that for accountant-auditor (446). Triggs has also developed indices for
nurses in several specialities, described in an unpublished paper. As more
of these indices are published the value of the Kuder in vocational
counseling will increase, but the counselor will be called upon to exercise a
high degree of judgment in deciding when a deviation from the mean is
so great as to suggest the abandonment of an objective by the client.

Standardization and Initial Validation. Many users of the Kuder
Preference Record have been puzzled by the method of weighting the items
in the inventory. The mental set established by Strong's work led them
to believe that Kuder’s interest scales were occupational in nature, that
his scientific scale, for example, was the scale for a scientific family of
occupations, but the early manuals and published studies showed no
evidence of occupational standardization. The alternative explanation
seemed to be that the keys were based on a factorial analysis of interests,
such as were made with Strong’s data, but again there was no evidence of
such work. Lacking any such empirical basis, the scales were not
infrequently suspected of being the product of nothing more than a priori
reasoning.

Succeeding editions of the manual have attempted to make clear 
exactly how the scales were developed, but the writer has talked with 
competent applied psychologists who had still not grasped the procedure, 
simple though it is. The first step was the construction of a priori scoring 
keys, in one of which all seemingly literary items were scored, in another
all scientific, and so on. The second step was to score the blanks of several
hundred persons with these scales. The third step was to make an item
analysis, to ascertain the internal consistency of these scales. If it was found
that those persons who had made high a priori literary scores tended to
choose a given item more often than those who had made low scores, the
item was retained in the literary scale; if it was not so chosen, it was
discarded. After this procedure had been applied to all the a priori keys
it was found that some of the empirically purified scales (the seven
published with the first edition) were internally consistent, independent of
each other, and reliable, while others (athletic, religious, and
social-prestige interests) were not internally consistent or independent; they
were, in fact, purified out of existence by the item analysis (the
social-prestige scale actually split in two). The item analysis therefore gave an
empirical basis for stating that the interest scales measure something,
and that these entities are independent of each other and unchanging in
their composition. The method of naming traits is then comparable to
that in factor analysis, and depends on inspection of the items and
judgment as to their nature. The names given by Kuder seem warranted, as
might be anticipated when the items are rather transparent. The two
scales added with the second edition were for mechanical and clerical
interests, and were based only on internal consistency; they are
correlated somewhat more highly with the scientific and computational scales.
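
The three steps can be sketched in a few lines of code. This is a minimal illustration and not a reconstruction of Kuder's actual computations: the item names and responses are invented, the function name is hypothetical, and the practice of contrasting the highest- and lowest-scoring fractions of the group (here a `top_fraction` parameter) is an assumption, since the text says only that high scorers were compared with low scorers.

```python
def item_analysis(responses, a_priori_key, top_fraction=0.27):
    """Retain the items of an a priori key that high scorers choose more often.

    responses: list of sets, each holding the items one person chose.
    a_priori_key: set of items judged by inspection to belong to the scale.
    top_fraction: share of the group treated as "high" and as "low" scorers
        (an assumption made for this sketch).
    """
    # Step 2: score every blank with the a priori key.
    scores = [len(chosen & a_priori_key) for chosen in responses]

    # Split off the lowest- and highest-scoring persons.
    order = sorted(range(len(responses)), key=lambda i: scores[i])
    n = max(1, int(len(responses) * top_fraction))
    low, high = order[:n], order[-n:]

    # Step 3: keep an item only if high scorers choose it more often.
    retained = set()
    for item in a_priori_key:
        p_high = sum(item in responses[i] for i in high) / n
        p_low = sum(item in responses[i] for i in low) / n
        if p_high > p_low:
            retained.add(item)
    return retained


# Invented responses of six persons to a hypothetical "literary" key.
key = {"write", "read", "golf"}
blanks = [
    {"write", "read"},
    {"write", "read", "golf"},
    {"golf"},
    set(),
    {"golf"},
    {"write", "read"},
]
print(item_analysis(blanks, key, top_fraction=0.34))
```

In this invented example "golf" is chosen equally often by high and low scorers and so is dropped from the key, while "write" and "read" are retained. A scale whose items nearly all fail this test would be "purified out of existence" in the sense described above.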

The intercorrelations of the original seven scales range from −.34 for
the scientific and persuasive scales to .19 for the scientific and
computational scales, when based upon 2667 adult men in a variety of
occupations (446). The somewhat higher intercorrelations for the new scales are
.50 for the clerical and computational, and .405 for the mechanical and
scientific scales.

Reliability. The reliability of the Kuder scales has been ascertained
for several different age groups and summarized in the manual by Kuder.
For 8th-grade students the Kuder-Richardson reliability coefficients
range from .8] to .96 (100 boys and girls); for 129 high-school senior boys
they range from .87 to .94; for a similar number of senior girls they were
.80 to .94; for 900 employed men, .88 to .95. One study involving retest
reliabilities (862) showed even higher reliabilities for jy graduate students,
ranging from .99 to .98. These high reliabilities may be the result of
item-transparency and the stability of self-concepts more than of the
adequacy of the inventory; Piotrowski’s study, mentioned earlier, might
be taken as lending support to this interpretation. But, whatever it is the
Kuder measures, it measures it reliably.
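
For readers unfamiliar with Kuder-Richardson coefficients, a small sketch of one of them may be helpful. The text does not say which Kuder-Richardson formula was used; Formula 20 is taken here as an assumption, and the items are treated as simply scored 0 or 1, which simplifies the way triad responses are actually keyed. The data are invented and the function name is hypothetical.

```python
def kr20(item_matrix):
    """Kuder-Richardson Formula 20 for items scored 0 or 1.

    item_matrix: list of rows, one per person, each a list of 0/1 item scores.
    Returns (k / (k - 1)) * (1 - sum(p * q) / variance of total scores),
    where k is the number of items, p the proportion scoring 1 on an item,
    and q = 1 - p. The variance is computed with N in the denominator.
    """
    n_people = len(item_matrix)
    k = len(item_matrix[0])

    # Variance of the total scores.
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    # Sum of item variances p * q.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n_people
        pq_sum += p * (1 - p)

    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Invented scores of four persons on three items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(scores))
```

Because the coefficient rises when items hang together, a scale built by retaining only internally consistent items will tend to show high values of this kind, which is in keeping with the caution expressed above about what the high reliabilities mean.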

Validity. Beginning in 1940, and in increasing numbers each year,
except for a decline during the last year of the war, studies of the
relationship between Kuder scores and other variables have been appearing in
the literature. According to the writer’s count, there was one validation
study published in 1940, two in 1941, two in 1942, four in 1943, five in
1944, one in 1945 (by which time publication lag had presumably caught
up with the absorption of psychologists in the war effort), three in 1946,
and six in 1947. All but one were by persons not directly connected with
the inventory, for Kuder has tended to publish his findings only in the
manual. This demonstrates the recognition on the part of the counselors
and psychologists of the need for more evidence concerning the validity
of a popular and promising instrument.

Intelligence has not frequently been correlated with Kuder scores,
perhaps because other problems seemed more vital. Adkins and Kuder (8)
reported one study of the relationship of interest scores to primary mental
abilities, an investigation which does have special interest because the
mental abilities measured were specific. Their data were obtained from 512
university freshmen. The correlations between Kuder and PMA Test
scores were low, except for one of .99 between number ability and
computational interest, a readily understandable relationship. Triggs (870)
correlated the Kuder with A.C.E. Psychological Examination scores, also
finding low correlations, except for one of .40 between literary interests
and verbal scores, and another of .40 between computational interests and
quantitative scores, but these relationships held for women only. Why
they were not found in men, if not merely chance findings in women, is
difficult to explain. Perhaps social pressure makes college men develop a
modicum of computational interest regardless of special ability, whereas
women do so only if they have unusual aptitude for such work. But this
would not explain the relationship between literary interests and verbal
ability in women, who are normally both more verbally and more
literarily inclined than men. More and better studies are needed to clarify
these matters.

Aptitudes as measured by the Bennett Mechanical Comprehension
and Minnesota Paper Form Board Tests were correlated with Kuder scores
in a study of 40 aircraft factory foremen by Sartain ((>71). For the
mechanical scale the two correlations were .13 and .15, for the scientific scale .19
and .15, high enough to show some connection, but too low to make the
relationship practically important.

Interests as measured by Strong’s Vocational Interest Blank have been
related to Kuder scores in a number of studies, particularly in a series by
Triggs (870,871,872,99!)). Peters (597) first reported correlations ranging
from .98 to .52 for 2 j college women tested with the Kuder and Strong’s
Women’s Form. The correlations between Kuder scientific and Strong
physicians’ interests, computational and office workers’ interests, literary
and authors’ interests, and social service and lawyers’ interests (heavily
loaded with the “people” factor in women) were significant, as one would
expect. So also was that between scientific and lawyers’ interests, which
is difficult to explain, except on the grounds of their common correlation
with intelligence as shown by Strong.

Male subjects provided the basis of Triggs’ final study (871), in which
the trends were similar to those reported for women by Peters. For these
ib!i men the relationships for typical, presumably similar, scales were
as given in Table 31.

These relationships tend to be what one would expect, but they are
low enough so that it would not be possible to use one instrument as a
substitute for the other, as many had hoped would be possible. On the
other hand, the varying degrees of relationship make it possible to use
either inventory with better understanding of what is being measured, or
both inventories together in order to make a more penetrating analysis
of a client’s interests.

Table 31

CORRELATIONS BETWEEN KUDER AND STRONG SCALES

                                         Kuder Scale
Strong Scale           Sci.  Mech.  Soc. Serv.  Comp.  Cler.  Pers.  Lit.

Physician              .50
Psychologist           .36
Engineer               .54   .72
Chemist                .73   .51
Carpenter              .26   .67
Math.-Sci. Teacher     .47   .46
YMCA Sec'y                          .35
Social Sci. Teacher                 .30
City School Supt.                   .42
Accountant                                      .49    .55
Office Worker                                   .25    .38
Life Insurance Sales                                          .58
Lawyer                                                               .50
Author-Journalist                                                    .28

The existence of a higher degree of relationship between the Kuder
scientific and Strong chemist scales (.73) than between the mechanical
and the chemist scales (.51), when contrasted with the inverse order of
relationship for the Strong engineer scale (.54 and .72), suggests that the
Kuder scientific scale assesses a more theoretical, laboratory, or biological
type of interest than does the mechanical, and that in testing a would-be
engineer it is well to attach more weight to the mechanical scale, while
for a would-be chemist the scientific scale should be stressed. It is
noteworthy that Kuder has revealed an awareness of these relationships in
his occupational classification in the manual (446:5-8), for chemist is
placed in the scientific group, while the various engineers are placed in
the mechanical-scientific. It would have been even more accurate, judging
by these data, to place the chemists in a scientific-mechanical group (note
the order) and leave only the more purely biological occupations in his
scientific group.

The almost identical correlations between the Strong mathematics-and-
science teacher scale on the one hand, and the Kuder scientific and
mechanical scales on the other (.47 and .46), provide an interesting
contrast with both of the sets of relationships discussed in the preceding
paragraph, and the closer relationship between the carpenter and
mechanical scales as compared with that between the carpenter and
scientific scales (.67 and .26) further strengthens the interpretation
suggested.

The clerical scale correlates more closely with both the accounting
and the office work scales (.55 and .38) than does the computational (.49
and .25). This might be taken as a reflection on the computational scale,
but it should be remembered that a good measure of a factor need not
necessarily have the most significant relationship to any variable in
which it plays a part: in other words, the computational factor may be
a very real one in some occupations, which actually can be classified in
an occupational field in which other factors are more important.
Accountants do computational work, but they are also concerned with other
aspects of office work and record-keeping, as reflected in their clerical
interests.

The much higher correlation between literary and lawyer (.50) than
between literary and author-journalist (.28) scales is worth noting, for it
suggests that the Kuder literary scale is likely to be more valid for legal
than for literary occupations. Strong’s factor analysis of his scales
(775:113 and 319) shows that his lawyer and author scales have approximately
the same loading of his “things vs. people” factor (-.92 and -.98), while
the lawyer scale has a slightly heavier loading of the “system” (.26 vs.
-.19) and light loading of the social welfare (-.22 vs. -.01) factors. It is
difficult to rationalize these two sets of data. More investigation of the
differences between Kuder and Strong scores is clearly needed.

Counseling experience has suggested (802) that the apparent
discrepancies between Kuder and Strong scores may have diagnostic
significance. Some persons who made high persuasive scores on Kuder’s
inventory but low life insurance salesman scores on Strong’s seemed on the
basis of case history and interview material to be interested in promotional
activities, but to dislike activities in which they need to push people to
the point of action as in closing a sale. The diagnosis and counseling of a
number of clients on the basis of this interpretation of differences between
persuasive and salesman scores has seemed fruitful, in a few cases even
dramatic, but too few have been handled to justify any conclusions. It is
also possible, for example, that such discrepancies are the result of effects
such as that described by Paterson (58b), and that the higher Kuder
persuasive score is the result of self-delusion or of an attempt to impress
the consultant, while the lower Strong salesman score reflects more
accurately the true interests of the client. If this were the case the
selection of salesmen could be improved by using both inventories and
devising an index of distortion based on discrepancies between the two
scores; the better salesmen would presumably be those whose discrepancy
scores were smallest. The hypothesis would be worth testing.
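
The index of distortion proposed here is not worked out in the text. The following is only a minimal sketch of one way such an index might be computed, assuming that both scores are first expressed as standard scores on a common norm group. The function names, the norm figures, and the three applicants are hypothetical and serve only to show the arithmetic.

```python
def standard_score(raw, norm_mean, norm_sd):
    """Express a raw score as a z-score relative to a norm group."""
    return (raw - norm_mean) / norm_sd


def distortion_index(kuder_persuasive_z, strong_salesman_z):
    """Hypothetical index: Kuder persuasive minus Strong salesman, in z units.

    A large positive value means the Kuder persuasive score stands well
    above the Strong life insurance salesman score.
    """
    return kuder_persuasive_z - strong_salesman_z


# Made-up applicants: (Kuder persuasive z, Strong salesman z)
applicants = {"A": (1.5, 1.3), "B": (1.6, -0.4), "C": (0.2, 0.1)}

indices = {name: distortion_index(k, s) for name, (k, s) in applicants.items()}

# Under the hypothesis, the smallest discrepancies mark the better prospects.
ranked = sorted(indices, key=lambda name: abs(indices[name]))
```

Whether small discrepancies do in fact go with better sales performance is exactly what would have to be tested against a criterion of success.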

Personality traits have, we saw in connection with Strong’s Blank,
generally been assumed to be related to interests. This hypothesis was
checked by Evans (243) with the Minnesota Personality Scale and the
Minnesota T(hinking), S(ocial), E(motional) Inventory, in relation to
the Kuder Preference Record. She tested 190 women students at Indiana
University, and reported that social introverts tended to score low on the
Kuder persuasive interest scale, as did thinking introverts, while
extroverts of both types tended to make average or high persuasive scores.
Thinking extroverts were low also on literary interests, although thinking
introverts made average scores on the literary scale. Triggs (873)
correlated the scores of 35 male and bo female college students on the Kuder
and on the Minnesota Multiphasic Personality Inventory, finding that in
men mechanical interests were significantly and negatively correlated
with psychopathic and feminine tendencies, computational interests with
paranoid, scientific with paranoid and psychasthenic, and social service
with depressed tendencies, while musical interests were significantly and
positively related to psychasthenic and schizophrenic, clerical interests to
depressed, psychasthenic, and schizophrenic tendencies. Her data are
reproduced in Table 32. In women no significant relationships were
found between interests and personality traits, although two relationships
with validating scores were significant.

Table 32

CORRELATIONS BETWEEN SCORES ON THE PREFERENCE RECORD AND
ON THE MINNESOTA MULTIPHASIC INVENTORY FOR 35 MALE
STUDENTS FROM THE UNIVERSITY OF WASHINGTON
FROM TRIGGS (873), UNPUBLISHED PAPER

Kuder scales (rows): 1 Mechanical, 2 Computational, 3 Scientific,
4 Persuasive, 5 Artistic, 6 Literary, 7 Musical, 8 Social Service,
9 Clerical

[Column headings, coefficients, and the level-of-significance note are
illegible.]
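
For a group as small as the 35 men in Triggs’ study, the size of coefficient needed for significance can be worked out from the usual t test for a correlation coefficient. The sketch below is an illustration, not a reconstruction of Triggs’ own computations. The critical values of t for 33 degrees of freedom (2.0345 at the 5 percent level and 2.7333 at the 1 percent level, two-tailed) are taken from standard tables and are supplied here as assumptions; they are not figures from the study.

```python
import math


def t_for_r(r, n):
    """t statistic for testing a correlation r from n cases against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def critical_r(t_crit, n):
    """Smallest absolute r that reaches the critical t with n cases."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


N_MEN = 35                              # number of male students in the study
r_05 = critical_r(2.0345, N_MEN)        # 5 percent level, two-tailed
r_01 = critical_r(2.7333, N_MEN)        # 1 percent level, two-tailed
```

On these assumptions a coefficient must reach roughly .33 to be significant at the 5 percent level and roughly .43 at the 1 percent level, which shows how large a relationship must be before it can be distinguished from chance in so small a group.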

In view of the currently prevalent idea in guidance centers that social
service scores on the Kuder are an indication of personality
maladjustment, Triggs’ findings are especially worthy of note: social
service interests are shown to accompany wholesome rather than unhealthy
personality patterns. This does not disprove the observation that some
people who want to enter social, educational, or psychological work of
one kind or another are maladjusted, but it does make one seriously
question the tendency to look on high social service scores as indices of
maladjustment. There is more justification for seeking other signs of
disturbance in persons with high musical or clerical scores, but even here
the relationships are low enough to make it clear that there are many
exceptions. Indeed, experience with clients leads the writer to discount
musical, artistic, and literary scores unless there is good supporting
evidence in the case history; it seems that many people without highly
developed interests make high scores on one or more of these scales,
presumably because most high school and college graduates enjoy
listening to some kind of music, looking at some kinds of pictures, and
reading fiction enough to seem interested in one of these fields if other
more definite interests are lacking.

Grades or other indices of academic achievement have been correlated
with Kuder scores in at least ten studies. Triggs (870) found correlations
of . j2 (women) and .92 (men) between scientific interests and general
science achievement, . jo (men) and .y> } (women) between literary
interests and achievement in English literature, .31 (men) and .36 (women)
between computational interests and mathematical scores. Yum (952)
found significant relationships between the literary interests and grades
of men (.99*,) and between the computational interests and average grades
of women (.295) at the University of Chicago, but the comparable
relationships for the opposite sex were in each case not significant. Crosby
(18 1) reported significant differences between the chemistry and biology
grades of high- and low-scoring scientific interest groups (critical ratios
= 7.(1 and 12.2), and between the accounting grades of high- and low-
scoring computational interest groups (6.9). The 191b manual cites a
thesis by Mangold (506), in which she found significant relationships
between scientific interests and scores on the Co-operative Natural Science
Test (.^>85), literary interests and Co-operative English Test scores (.31)
and literary interests and literary scores on the Co-operative
Contemporary Affairs Test (.59). Detchen (199) developed a scale based on
109 of the 785 Kuder items which were found to differentiate A and B
students from D and E students, and obtained a validity coefficient of .60
with a social science comprehensive examination as her criterion; her
subjects were 2 17 students in the original group, 106 in the
cross-validation group for whom the validity coefficient shrank to a still
significant .57. The typewriting and stenography grades of women liberal
arts students, 96 and 75 in number, were related to Kuder clerical scores by
Barrett (45), who found that the interest scores did differentiate superior
(A and B) stenography students from inferior (D and F) students, the
cut-off score being the 55th percentile; the scale had no validity for
typing. Dentistry grades were the criterion used by Thompson (825); he found
no relationship (-.06) between mechanical interests and dental practicum,
while the validity of the social service scale was .24. These seemingly odd
results may perhaps be explained by the very high mean mechanical interest
scores (91st percentile) and their restricted range, whereas the social
service scores had a lower mean (67th percentile) and presumably a greater
range. On the other hand, scientific interests correlated .28 with theory
grades, as anticipated.
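
The effect of a restricted range on a validity coefficient can be illustrated with the standard formula for direct selection on the predictor, which assumes linear regression and equal scatter about the regression line throughout the range. The sketch below is illustrative only: the standard deviations in Thompson’s dental group are not given in the text, so the starting correlation of .40 and the threefold reduction in spread are hypothetical figures, and the function name is supplied here.

```python
import math


def adjust_for_range(r, sd_ratio):
    """Correlation expected after the predictor's SD is changed by sd_ratio.

    sd_ratio = (new SD) / (SD in the group where r was observed).
    A ratio below 1 shows the shrinkage caused by restriction; a ratio
    above 1 gives the familiar correction for restriction of range.
    """
    return (r * sd_ratio) / math.sqrt(1 - r * r + r * r * sd_ratio * sd_ratio)


# Hypothetical: validity of .40 in an unselected group, and a selected
# group whose interest scores have one third of the original spread.
unrestricted_r = 0.40
restricted_r = adjust_for_range(unrestricted_r, 1 / 3)

# Applying the inverse ratio recovers the original coefficient.
recovered_r = adjust_for_range(restricted_r, 3)
```

On these made-up figures a validity of .40 falls to about .14 in the selected group, which is the kind of shrinkage that could make a scale appear useless among students nearly all of whom score very high on it.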

Achievement on the USAFI Tests of General Educational Development
was related to Kuder scores in a well designed study by Frandsen (271).
Achievement in the natural sciences correlated .31 with computational
and .50 with scientific interests; in the social studies, -.37 with social
service but .31 with literary and .34 with scientific interests, probably
because of the respectively negative and positive correlations between
those types of interests and academic ability. Frandsen cites a master’s
thesis in which Turner reported correlations of .29 and .32 between
scientific interests and grade-point-ratio in several courses in the
biological and physical sciences, and .49 between computational interest and
grades in physical sciences. On the basis of his and other findings
Frandsen appropriately concluded that “science and mathematical interests
are definitely related to general achievement in parallel areas. For other
areas, significant and logically consistent interest-achievement
relationships have not been so clearly indicated, though some slight
relationships have been noted for literature and social studies.”
Exceptions, Frandsen goes on to state, appear to be due to more fundamental
negative relationships between social service interests and mental ability.

Completion of Training. From this point Frandsen proceeded to
check Strong’s hypothesis that interest would result in remaining in
rather than leaving a field of endeavor, by correlating Kuder scores with
percent of total credit in scientific and social studies. The correlations
are shown in Table 33.

These data support Strong’s hypothesis: students with social service
interests tend to choose more social studies courses, and students with
scientific interests tend to elect more scientific courses. Further
confirmation is found in a study by Bolanovich and Goodman (109), in which
the engineering grades of f>6 women students of electronics in the Radio
Corporation of America’s war-training program correlated only .09, .18,
and .10 with mechanical, computational, and scientific interests on the
Kuder, but the scientific and computational interests of the cadettes who
successfully completed training were significantly higher than those of
women who did not complete it, while those who were released scored
significantly higher than others on the persuasive scale. These two studies
seem to provide convincing evidence that what Strong found with his
inventory is also true of the educational predictive value of Kuder’s.

Table 33

CORRELATIONS BETWEEN KUDER SCORES AND CHOICE OF COURSES

                              Percent Total Credit in
Kuder Interest Scale      Natural Sciences   Social Studies

Scientific                      .54              -.35
Social Service                 -.17               .32

Occupational choice was related to Kuder scores by Crosby and Winsor
(185), by Kopp and Tussing (441), and Rose (647). The first authors
asked college students to estimate their interest in the seven types of
activities measured by the then current form of the Preference Record,
and correlated these estimates with scores on the interest inventory; the
average coefficient was .54, and there was more agreement between the
two indices for the more intelligent (as measured by the A.C.E.) than for
the less intelligent students. Kopp and Tussing found similar results
with approximately r,o high school boys and an equal number of high
school girls (r = .59 and .50), using the nine categories of the revised
Preference Record. Rose used a similar procedure with 60 veterans,
finding a correlation of .fit between inventoried and expressed preferences.
Those who had specific objectives showed no closer agreement than
others. About two-thirds of the group preferred occupations in fields in
which they made high scores. These results are consistent with those
already seen for Strong’s Blank.

Success in an occupation has been correlated with scores on the Kuder
in only three published studies at the time of writing. In the first of
these, Sartain (671) administered a battery of tests to 40 foremen and
assistant foremen in an aircraft factory who were rated by their
supervisors. The ratings had an interform reliability of .79, but yielded
significant correlations with none of the instruments; that for the Kuder
mechanical scale was .07, social service scale -.06, and clerical scale
.003. In the second study, initiated by the writer and reported by Guilford
(316:613-bifi), the Kuder was administered to 937 AAF pilot cadets who
later took primary training. The correlations with success in training were
statistically significant for only one scale, and that coefficient was only
-.10, between social science interests and success. The validity
coefficient for the mechanical scale was only .02, and the musical and
artistic scales, which on the basis of results from information tests and
biographical data blanks would be expected to have negative validities,
actually had low but nearly significant positive validities (.05 and .08).
Guilford suggests that this is because the Kuder scales sample interest and
appreciation, whereas the more valid (for predicting success) tests of
information and biographical data sample experience. Thompson (826) found
superior management engineer executives more interested in mechanical and
less interested in social service activities than average men in similar
jobs.

In an unpublished study, reported in an abstract of a paper by
DiMichael and Dabelstein (200), efficiency ratings of 100 vocational
rehabilitation workers were correlated with Kuder scales. Of jS
relationships computed, the first two of those which follow were significant
at the one percent level, the third at the 5 percent level:

r promotional work and persuasive interest score = .32,
r professional reading and scientific interest score = .26,
r employer contacts and persuasive interest score = . icj.

These findings suggest that although uncorrelated with overall success
in a job, interest as measured by the Kuder may be related to success in
some aspects or duties of a varied job.

Occupational differentiation on the basis of Kuder scores has been
most extensively reported in the manual, in which Kuder reports patterns
for a number of men’s and women’s occupational groups. These have
already been discussed in connection with the norming of the Preference
Record; it was pointed out that the numbers in each field are distressingly
small, and the fact that the selection of the samples of each occupation
is not made clear suggests that it was opportunistic rather than planned.
It has also been seen that in the case of one women’s occupation, nursing,
increasing the size of the sample made little difference in the mean or
standard deviation. Brief verbal summaries of the patterns revealed in
Kuder’s work are given below, as a tentative guide to the interpretation
of the scores.

Men in social welfare occupations, e.g., vocational rehabilitation super¬ 
visors, clergymen, social workers, school administrators, and teachers of 
social studies in high schools, tend to make high social service and
literary scores; personnel managers, however, are somewhat less distin¬
guished by high scores on these areas, and unlike the social welfare group 
tend to make equally high scores on the persuasive scale. 

Men in literary occupations, such as writers, English teachers, and 
actors, tend to make high literary and musical scores, but actors are also 
high in artistic interests; lawyers and judges differ even more in that they 
make high scores in the persuasive area as well as in the literary and 
musical. 

Scientists such as chemists and engineers tend to make high scores on 
the scientific scale, electrical and especially industrial and mechanical 
engineers also making high scores on the mechanical scale. The computa¬ 
tional scores of these groups are higher than average, but only in the 
case of the industrial engineers are they significantly high. The only 
significantly high score made by the sG draftsmen was in the artistic 
area. Spear (730) found similar trends in engineering freshmen, as did
Baggaley (36) with liberal arts college freshmen.

Clerical workers, including accountants, auditors, bookkeepers, and
cashiers, tend to make high computational and clerical scores, the higher-
level groups being outstanding in computational and the lower-level
groups in clerical interests.

Salesmen and sales managers make their highest scores on the
persuasive scale, this being the only outstanding score of salesmen who
sell to individual consumers, while those who sell to distributors or
manufacturers tend also to make high clerical interest scores. Judging by
pattern inspection, life insurance agents (N = 24) do not differ appreciably
from other salesmen, a finding which is at variance with Strong’s data,
previously discussed.

The patterns for women are in most cases similar to those of men in
the same field, and like the men’s, they tend to agree with expectation.
Women physicians tend to make high scores in scientific and mechanical
fields, as do laboratory technicians, but neither group is high in
computational interests (no similar men’s groups were tested). Nurses make
their high scores in scientific and social service areas, but it is
noteworthy, in view of Strong’s findings concerning some women’s
occupations, that none of the means are as high as the 75th percentile; in
other words, they are a relatively undifferentiated group. This is true
also of women telephone operators, stenographers and typists, teachers of
home economics, and teachers of social studies, as Strong’s work would lead
one to expect.

Groups of 50 male life insurance salesmen and 50 social workers were
tested by Lewis (469), who found the former significantly higher than the
general population in persuasive and the latter significantly higher in
social service interests. Profile analysis was not made, however. Lehman
(465) followed up students of home economics at Ohio State University,
finding from 10 to 125 in each of several subdivisions of the field.
Teachers, the largest group, scored high on social service, artistic, and scientific
interests; hospital dieticians were high in social service, scientific and 
computational areas; restaurant and tea room managers scored high on 
the artistic and computational scales; home service and equipment work¬ 
ers made high scores in social service and persuasive fields; and journalists 
in the literary and artistic fields. Women marines were tested by Hahn 
and Williams (323), who found relationships between interest patterns 
and duty assignments which, like those just reviewed, were in line with 
expectation. 

Job satisfaction has so far been used as a criterion only by Hahn and
Williams, in the study just referred to, and by DiMichael and Dabelstein
(200). The former found that satisfied clerical workers were significantly
more interested in clerical activities as measured by the Kuder than were
dissatisfied clerical workers, the critical ratios for three sub-groups being
2.28, 2.41, and 2.97. Clerk-typists who were dissatisfied tended to be more
interested in mechanical matters; general clerks who were satisfied were
also more interested in computational activities.
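
A critical ratio is the difference between two group means divided by the standard error of that difference, and it is read against the normal curve. The sketch below shows the arithmetic and converts the three ratios reported by Hahn and Williams into two-tailed probabilities. It assumes independent groups and the normal approximation; the means and standard errors behind the published ratios are not given in the text, so the first function is shown only in general form and its name and arguments are supplied here.

```python
import math


def critical_ratio(mean_a, mean_b, se_a, se_b):
    """Difference between two independent means over its standard error."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def two_tailed_p(cr):
    """Chance probability of a ratio at least this large, normal curve."""
    return math.erfc(abs(cr) / math.sqrt(2))


# Critical ratios reported for the three clerical sub-groups.
reported = (2.28, 2.41, 2.97)
probabilities = [two_tailed_p(cr) for cr in reported]
```

Read this way, the three ratios correspond to chance probabilities of roughly .023, .016, and .003, so that all three differences pass the 5 percent level and the last passes the 1 percent level.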

DiMichael and Dabelstein (200) correlated satisfaction with various
job duties, as rated by 100 vocational rehabilitation counselors, with
scores on appropriate Kuder scales administered five months previously.
The correlation between enjoying “contacting employers to secure jobs”
and the Kuder persuasive scale was .28, and between “handling clerical
details” and clerical interest scores .32. None of the expected
relationships between social service aspects of the job and social service
interests were significant for this group. Another group of .jfi male
counselors were tested after they had made the job satisfaction ratings,
and it is interesting that here the correlation between enjoyment of the
job as a whole and social service interest score was .29 as opposed to .13
for the other group, that between enjoying interviewing clients and social
service interest score rose from .06 to .43, and other expected
relationships became closer. This might be attributed to either of two
factors: The first group may have lacked insight into their interests when
they filled out the Preference Record, the subsequently completed
satisfaction questionnaire therefore being a more accurate picture of their
interests. Or the second group, having filled out the satisfaction
questionnaire first, may have answered the interest inventory more
searchingly and insightfully, perhaps even distorting answers in order to
make them consistent with what they had already said. As the first group had
had id5 years’ experience on the job, and the second only 1 year, it does
not seem likely that the first explanation is correct. The first group knew
its work, but did not know it was going to rate it for job satisfaction; the
second group also knew its work, though less well, and had already rated it
for satisfaction. The closer agreement between the two indices in the latter
group must therefore be related to having job satisfaction in mind when they
took the Preference Record. It would be interesting to know to what
extent the greater agreement represents, respectively, stereotyping,
insight, and distortion.

Use of the Kuder Preference Record in Counseling and Selection. It
has been established that the traits measured by the Kuder are internally
consistent and relatively independent of each other. They are not closely
related to intelligence, although there appears to be a degree of
relationship between some primary mental abilities and the expected
interests. Similarly, special aptitudes such as mechanical comprehension
seem to be somewhat related to appropriate interests. The relationships
between Kuder and Strong scores are found according to expectation, but
they are not high enough to justify using Kuder scores as though they were
obtained from Strong’s Blank. The reason for this is obvious enough: the
Kuder scores measure relatively pure interest factors, whereas Strong
scores measure the interests of people in occupations. Chemists, for
example, are characterized by interests which are partly scientific and
partly mechanical, while mechanical engineers have a combination of
mechanical, scientific, and computational interests. Personality traits
have also been found to be related, in some instances, to interests as
measured by the Preference Record: contrary to a commonly held opinion
among vocational counselors and psychologists in guidance centers,
interest in social service is related to wholesome personality patterns
as measured by the Minnesota Multiphasic Personality Inventory, as are
mechanical interests; on the other hand, the personality patterns
associated with musical and clerical interests are not so healthy.

The development of interests as measured by Kuder’s inventory is not 
clear. Data so far collected indicate that there are no significant changes 
associated with age during high school and college years, but as this 
tentative finding is contradicted by the much more intensive and extensive
work with Strong’s inventory it seems well not to draw any conclusions
until after more thorough-going studies are completed.

Occupational significance of scores on the Preference Record has been
demonstrated largely by compilation of means and sigmas for people
employed in various occupations. Although the numbers are small, the
data indicate differences between groups such as would be hypothesized
on the basis of Strong’s results. The development of occupational indices,
or procedures for the statistical comparison of an individual’s scores with
those of people in various occupations, will make the occupational
interpretation of the Kuder more objective, but it will take some time to
make an appreciable number of these available. In the meantime we have seen
reason for thinking that Kuder’s classification of occupations by interest
types has some validity, although the published materials indicate that
as yet much of the classification has no empirical basis. The little
material available on the relationship between Kuder scores and success on
the job is less encouraging than for Strong’s inventory, although one
study has shown some relationship between interest and success in
appropriate duties or aspects of the job.

In schools and colleges the Kuder does seem to have real possibilities
even for the prediction of success in courses, for scores are significantly
related not only to the completion of training, as for Strong’s Blank, but
also to grades in some appropriate subjects, specifically the scientific and
mathematical. Validity for other subjects is more doubtful, at least when
the interest-range is as restricted as it generally is. The scorability of
this inventory and the ease with which student participation in scoring,
converting scores, and plotting profiles lends itself to interpretation of
results and discussion of their implications, give the Kuder many advantages
for use in school and college guidance programs. Its transparency is
presumably less important in counseling than in selection programs, and the
fact that scores have only moderately high correlations with expressed
preferences shows that it can contribute something to the diagnosis of
interests, especially for the least able students for whom the discrepancy
between choices and scores is greatest.

In guidance centers, whose clients are generally somewhat more mature
and more experienced than students, it is especially desirable to make a
careful study of the manifest interests of clients to whom the Kuder is
administered, as a precaution against overemphasis on the literary,
musical, and artistic scores which seem often to be high simply on an
appreciation basis. Even in schools this can be similarly checked, but there
the counselor may need, and be able, to depend partly on try-out
experiences. Differences between Kuder and Strong scores often suggest new
interpretations worth exploring in interviews, making the use of both
instruments desirable in difficult cases.

The value of the Kuder in employment centers and in business and
industry is still virtually unknown, as it has been little used in such
situations. Despite its “industrial” short form it was apparently not
designed with such use in mind, and its transparency has militated against
it. For it to be valuable in personnel selection or evaluation programs
more research should be done, including studies of the extent of faking
among applicants, the possibility of a distortion score, and the
development of occupational indices appropriate to the jobs of the specific
company or institution.

The Allport-Vernon Study of Values (Houghton-Mifflin, 1931)

This inventory was developed by G. W. Allport and P. E. Vernon in
an attempt to measure the personality traits postulated by Spranger in
his Types of Men (73 j). The traits measured are best described as values
or evaluative attitudes, although some of them verge on needs (see next
chapter). We have seen that they closely resemble interests but are perhaps
correctly described as more basic, for they concern the valuation of
all types of activities and goals, and they seem in some instances to be
more closely related to needs or drives. In practice, however, values and
interest inventories are often used more or less interchangeably, and their
relationships warrant treating them as interest inventories. The Allport-Vernon
is by no means the only values test, but it is the first of its kind,
has been the most thoroughly studied, and is still the most widely used.
A review of work with this and other values tests was published in 1940
by Duffy (216).

Applicability. The Study of Values was designed for use with college
students, and more as an instrument for research in the theory and
organization of personality than as a practical aid in counseling or selection.
Its vocabulary level is therefore higher than that of most inventories;
Stefflre (752) has shown that it has a vocabulary grade placement
of 11.3, and that only the Cleeton Vocational Interest Inventory, among
the widely used blanks, is more difficult to comprehend. For these reasons
the Allport-Vernon should be used only with superior high school juniors
or seniors, college students, or superior adults. Even for these some of
the items may be difficult to accept, if not to understand, because of their
seemingly esoteric nature. College students usually take them in their
stride, but employment applicants are often impatient with some of the 
mystical and aesthetic items. 

Changes in scores during the four years in college have been studied 
by Harris (342), Schaefer (674), and Whitely (922), and summarized by 
Duffy (216) as showing that: “. . . the lowest coefficients of correlation
are found always between the first and other administrations of the test,
and that the trend (perhaps not statistically significant) is toward an increase
in aesthetic, social, and theoretical values, and a decrease in religious,
political, and economic values.” Subsequent studies by Arsenian
(32) and Burgemeister (124) with college men and women do not alter
these conclusions, which fit in with conclusions concerning the increase 
in social welfare interests with age in adolescence, but contradict other 
data on scientific interests, and have no counterpart in so far as aesthetic 
and other values are concerned. It may be that the increases in aesthetic 
and theoretical interests, and decreases in religious and other values, are 
the result not of maturation but rather of college experiences. It would 
be helpful to have retest data for these same persons five and fifteen years
after graduation from college, but none are available. Neither are there 
studies of age changes in other more typical populations. 

Content. The Allport-Vernon consists of 45 items, the first 30 of 
which are paired comparisons and the last 15 multiple-choice, making 
120 alternatives in all. As in the Kuder Preference Record each of the
choices represents one of the types of interests or values; and the corrected
sum of the examinee’s choices of any one kind of item constitutes
his score for that type of value. As in the Kuder, a higher score on one
type of value automatically makes for a lower score on some other type 
or types. The items are designed to tap theoretical (interest in truth and 
knowledge), economic (interest in the useful or material), aesthetic (in¬ 
terest in form and harmony), social (interest in social welfare), political 
(interest in prestige and power), and religious (described as interest in 
unity with the cosmos but actually adherence to the forms of religion) 
values. The use of Spranger’s esoteric terminology has created many misunderstandings
of the traits measured, not only among users of the test
but also in some investigators who have taken the terms in their common 
rather than very special sense. Even Spranger’s definitions are misleading, 
as just noted in the case of religious values, because of poor implementation
of the authors’ intentions. The writer has frequently noted, for
example, that high school students from traditionally religious homes, in
whom observation and study revealed no real depth of religious feeling
or belief, make high religious scores on the Study of Values. In their
cases the scale seems to measure only verbal conformity to formal religion.
Any user of the inventory should therefore study the items carefully,
as well as the authors’ definitions, before making interpretations. 
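
The item counts and the forced-choice property described above can be checked with a short sketch. Only the counts (30 paired comparisons, 15 multiple-choice items) and the six value types come from the text. The point schemes (3 points divided between the two alternatives of a pair, ranks of 4, 3, 2, 1 for the four alternatives of a multiple-choice item), the random pairing of values within items, and all function names are assumptions made for illustration. The "corrected sum" mentioned in the text is not modeled.

```python
import random

# The six value types named in the text.
VALUES = ["theoretical", "economic", "aesthetic",
          "social", "political", "religious"]

N_PAIRED = 30   # paired-comparison items, two alternatives each
N_MULTI = 15    # multiple-choice items, four alternatives each

# Assumed point schemes, for illustration only.
PAIRED_POINTS = 3            # divided between the two alternatives
MULTI_RANKS = [4, 3, 2, 1]   # ranks assigned to the four alternatives


def total_alternatives():
    """Count the alternatives across both parts of the blank."""
    return N_PAIRED * 2 + N_MULTI * 4


def simulate_raw_scores(rng):
    """Answer every item at random and return one raw score per value.

    Which values are set against each other in a given item is drawn
    at random here; the real blank fixes this in advance.
    """
    scores = dict.fromkeys(VALUES, 0)
    for _ in range(N_PAIRED):
        first, second = rng.sample(VALUES, 2)
        given = rng.randint(0, PAIRED_POINTS)
        scores[first] += given
        scores[second] += PAIRED_POINTS - given
    for _ in range(N_MULTI):
        chosen = rng.sample(VALUES, 4)
        ranks = rng.sample(MULTI_RANKS, 4)
        for value, rank in zip(chosen, ranks):
            scores[value] += rank
    return scores


# Every respondent distributes the same number of points, whatever the answers.
POINTS_AVAILABLE = N_PAIRED * PAIRED_POINTS + N_MULTI * sum(MULTI_RANKS)

if __name__ == "__main__":
    print("alternatives:", total_alternatives())
    for seed in range(3):
        scores = simulate_raw_scores(random.Random(seed))
        print(scores, "sum fixed:", sum(scores.values()) == POINTS_AVAILABLE)
```

Because the total is the same for every respondent, a point gained on one value is necessarily a point lost on another. This is why, with an inventory built this way, a low score may reflect strong competing values rather than an absence of interest.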

Administration and Scoring. The blank requires from 20 to 40 
minutes to administer, depending upon the verbal ability of the examinee.
There is no actual time limit, but rapid work should be encouraged.
Directions are simple and clear. Scoring is by means of a self-explaining 
scoring and profile sheet, readily understood by college students. Final 
raw scores may be converted into deciles by a table on the profile sheet, 
but the small number of items makes the conversion very crude and 
complicates interpretation. The plotting of final raw scores on the profile 
sheet brings out the dominant values more effectively and with less exag¬ 
geration, and is to be recommended for use. This scoring procedure is 
more time-consuming than most in current use, but this is of minor 
importance when the inventory is used as a part of class work and is
scored by the students. The use of the profile is helpful in stimulating
discussions of values and goals, and in bringing about self-insight.

Norms. The college student norms provided by the manual have been
found reasonably adequate in a number of studies (342, 136), with variations
which seem explainable in terms of the clientele and emphasis
of the colleges in question. But these norms are general, and serve, like
Kuder’s, only as a backdrop against which to study the variations of part
scores within an individual. Occupational norms are also desirable, in
order to throw light on the vocational significance of the scale, but are
not available except for 26 YWCA secretaries (17). On the other hand,
the mean scores made by a great variety of college curricular or preoccupational
groups have been reported in various studies referred to
below in the section on occupational differences. These lend support to
the practice of interpreting Allport-Vernon scores in vocational terms.

Standardization and Initial Validation. The diagnostic efficiency of
the inventory was tested by the internal consistency method in the origi¬ 
nal study (898), in which it was found that the scales were relatively 
reliable and independent, only the social values scale being of question¬ 
able reliability (.65). Scores correlated .53 with students’ self-ratings on 
similar traits on the average (range of r’s = —.06 to .69), even though the 
reliability of the ratings was only .59, suggesting consistency between 
most self-concepts and self-described behavior. The one low intercorrelation
was for social values. Expected differences were found between curricular
groups, science majors, for example, being high on theoretical
and low on economic values, while business students tended to score
high on economic values. 
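
The remark that scores correlated .53 with self-ratings even though the ratings had a reliability of only .59 can be made concrete with Spearman's correction for attenuation. This is a standard psychometric formula and not a computation reported in the original study. The sketch below uses the .53 and .59 given above and, as a further assumption, takes the average retest reliability of .82 reported in the reliability section as the reliability of the inventory scores. The function name is hypothetical.

```python
import math


def disattenuate(r_xy, rel_x=1.0, rel_y=1.0):
    """Spearman's correction: estimated correlation between true scores.

    r_xy  : observed correlation between the two measures
    rel_x : reliability of the first measure (1.0 = leave uncorrected)
    rel_y : reliability of the second measure (1.0 = leave uncorrected)
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


R_OBSERVED = 0.53    # scores vs. self-ratings, from the text
REL_RATINGS = 0.59   # reliability of the self-ratings, from the text
REL_SCORES = 0.82    # assumed: average retest reliability of the inventory

# Correcting only for the unreliability of the ratings (the criterion).
corrected_for_ratings = disattenuate(R_OBSERVED, rel_y=REL_RATINGS)

# Correcting for the unreliability of both measures.
corrected_for_both = disattenuate(R_OBSERVED, REL_SCORES, REL_RATINGS)

if __name__ == "__main__":
    print(round(corrected_for_ratings, 2), round(corrected_for_both, 2))
```

Under these assumptions the observed coefficient understates the underlying agreement, which is the sense in which a validity of .53 against so unreliable a criterion is more impressive than it first appears.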

Reliability. As previously noted, the reliability of the social values
scale was found to be only .65 (898), but the average retest reliability
after three weeks was .82, showing considerable stability in the other
scores; these findings of the original authors have since been confirmed
by other investigators (136).

Validity. Scores on the Allport-Vernon have been related to most of
the variables which can be studied in college populations, although to
relatively few which are observable only in other groups.

Intelligence test scores have been correlated with values scores, for
example, by Pintner (606) in a study of 33 graduate students of educational
psychology, for whom the correlations were .2 j with theoretical,
.38 with social, -.28 with political, and —. ji with economic values, those
with other values being practically zero. Other studies, summarized in
the manual, in Cantril and Allport (136), and in Duffy (216) reveal
similar trends except for social values, the results for which are generally
not so clearly positive.

Grades were used as a criterion in Pintner’s study (606), but as they
were based partly on performance in test administration they are somewhat
atypical: social values correlated . jb with grades, while the other
coefficients were so small as to be negligible. Cantril and Allport (136)
found theoretical values correlated with sociology grades at Dartmouth
to the extent of .25. In a study of students at Sarah Lawrence College,
Duffy and Crissy (217) found a validity of .3 j for a combination of values
scores, using ratings of academic achievement at the end of the freshman
year as their criterion. Theoretical and aesthetic values had positive
weights, economic and political negative. With the Co-operative Test of
General Culture as a criterion, Schaefer (674) found relationships of .58
and — -17 between the literary achievement and aesthetic and economic
values of 51 women sophomores, . ]y and -.28 between fine arts and
aesthetic and economic values, .37 and -.37 between history and the
same values, and .31 between general science and theoretical value.
These relationships seem unduly high, and may be peculiar to the local
situation (Reed College); they would in any case need confirmation before
being applied elsewhere. A safe generalization from the studies
reviewed would seem to be that there is a slight tendency for students
with theoretical values to make better grades than students in whom
other values are dominant, a conclusion which is congruent with the
definition of the trait, and that in some situations other values will be
associated with success in appropriate fields of endeavor.

Success on the job has not, to this writer’s knowledge, been related to 
scores on the Allport-Vernon Study of Values.

Occupational differences have not been studied by means of employed
men or women, but numerous studies have shown that professional
students are differentiated by the Study of Values in accordance with
expectations. Theoretical values are found in students of education (342),
engineering (3 ji^), medicine (312,763), natural science (674), and social
studies (671). Economic values characterize only students of business
(7^3’^7 l). Aesthetic values are strong in students of drama (293), education
(763), literature ((>71.763), and the scj>c ini studies (674). Social values
have not so frequently been studied, as the scale is not reliable enough for
individual diagnosis; it is adequate for the study of group trends, which
show that YWCA secretaries (17) stand high on it, but, surprisingly, that
students majoring in the social studies (67 j) tend to make low scores.
Political values are significantly high in engineering students (3J2), physical
education students (695), and law students (312,763). Religious values
have been found to be high in seminarians (492) and in YWCA secretaries
(17), but the high scores of high school commercial students (87]) and
low scores of college students of business (763) suggest that the religious
values scores do not, in some cases, represent more than the lip service of
immature persons who have as yet experienced neither deep religious
feeling nor intellectual doubts concerning religion.

Satisfaction in one’s work has not been related to scores on the Study
of Values, as might be expected in view of its limited occupational use.

Use of the Allport-Vernon Study of Values in Counseling and Selection.
The traits measured by this inventory resemble those measured
by the other inventories studied in this chapter. Like the Kuder, it taps
interest factors which play a part in a variety of occupational fields,
usually in ways which would be anticipated in view of the nature of the
items. However, the traits appear to be somewhat more fundamental
and more closely related to basic needs and drives than those measured
by other interest inventories. They have been found to change somewhat
during the college years, social interests increasing as other studies have
also reported. But increases in theoretical and aesthetic values may be
related to specific college influences, together with decreases in religious
and economic values. Too little is known concerning age changes in
values. Values are related to intelligence in the same way as interests.

Occupations for which the Study of Values has significance appear to 
be largely at the professional and executive levels, but that is due to the 
vocabulary and intended use of the instrument. Values are related in 
expected ways to choice of training in fields such as art, business, drama, 
education, engineering, law, literature, medicine, natural science, psy¬ 
chology, the priesthood, social studies, and social work. Only in the last- 
named field have experienced workers been tested, but the data for 
training groups are consistent enough to justify some confidence in their 
occupational significance. As no norms are available, the counselor must 
interpret on the basis of peaks and valleys in the profile, a procedure 
which is safer with this instrument than with most when drawing con¬ 
clusions from high scores because of the method of construction, but 
more dangerous with low scores or valleys in the profile because interest 
in such a field may be very strong even though pressed down artificially 
in the mutually-exclusive response technique. 

In schools and colleges this inventory may have some value in deter¬ 
mining appropriate fields in which to major, although it generally has 
less value for predicting grades than an intelligence test. The nature and 
degree of the relationships between values and grades in various types 
of courses are likely to vary with the institution, because of the impor¬ 
tance of climates of opinion in attracting students and in modifying 
values. Differences in predominant values or climates of opinion in 
different colleges give the test some value in helping students choose 
congenial colleges. The self-scoring feature of the inventory makes its 
use in orientation and psychology classes easy, and it lends itself well 
to the starting of discussions of values, interests, and vocational objectives,
such as is appropriate to orientation programs. The esoteric nature
of some of the items limits its usefulness, however, to moderately well 
motivated persons, and the vocabulary limits it to superior high school 
and to college students. 

In guidance centers the Study of Values can be helpful in aiding
potential college students in the choice of colleges in which they will 
find the psychological atmosphere congenial and conducive to growth, 
although for this purpose comparisons between the mean scores of stu¬ 
dents in different colleges need to be made more systematically than has 
so far been done. A survey of the literature with this purpose in mind 
would yield some useful material. More important than this use, in
guidance centers, is the diagnosis of interests when it is suspected that
Kuder or Strong scores are distorted by a clear-cut but inappropriate
self-concept. The non-vocational nature of the Allport-Vernon items 
presumably makes them less subject to choice on the basis of vocational 
stereotypes, and more on their own merits, than the more clearly occupational
items in the Kuder and even the non-occupational parts of the
Strong. Unfortunately this hypothesis has never been tested. Until it is, 
the clinical counselor in search of an understanding of a puzzling client 
cannot afford to neglect this test and so to miss the chance to sink a shaft
into the interest field which is slightly different from those sunk by other 
instruments. 

In employment services, business, and industry this inventory is likely
to be less useful than in other types of counseling or selection situations. 
The vocabulary and subject-matter make it seem out-of-place to employ¬ 
ment applicants, and the norms and validation do not lend themselves 
to as effective use in selection programs as do those of certain other 
interest inventories. An industrial and business version might presum¬ 
ably be constructed and be of considerable value in selection because 
of the differences between it and the standard vocational interest inventories,
but such a project has yet to be planned and carried out.

The Cleeton Vocational Interest Inventory (McKnight and McKnight, 1937, 1943)

This inventory appears to have been developed in an attempt to 
simplify the scoring of Strong’s Vocational Interest Blank, and incorpo¬ 
rates many items used in it and in other inventories constructed in the 
Carnegie tradition. It has been rather widely used in schools, colleges, 
and guidance centers, but has not enjoyed the popularity of either the 
Strong, despite its simpler scoring, or the Kuder, which captured a huge
segment of the vocational-test-using public almost on publication. The
writer believes that this may be due partly to warranted misgivings concerning
the transparency of items grouped according to their occupational
significance, and partly to such an irrational thing as dislike of
the meaningless and difficult-to-remember codes used to designate the 
occupational families. Whether scientific or not, convenient handles help. 

Description. The Cleeton Inventory was designed for use in grades 9
through college, and with adults, but was constructed on the latter and 
has a vocabulary grade placement of 12 (752), making it the most difficult 
of the well-known interest inventories. Both men’s and women’s forms
consist of ten groups of items, each group representing an occupational
family (e.g., OCA: clerks, stenographers, typists, and other office work 
occupations) and consisting of 70 items, 30 of which are occupational 
titles, 20 names of school subjects, magazines, prominent persons, etc., 
and 20 leisure-time activities, work activities, and peculiarities of people. 
Scoring is done by adding unitary weights for each item marked in a 
given group. It was standardized by administering it to some 7000 individuals
engaged in a variety of occupations, principally in the Pittsburgh
area. In 7b percent of 1,741 cases the highest inventory rating agreed with 
the occupation engaged in, while in (jr, percent one of the three highest 
ranking groups included the occupation engaged in. 
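
The scoring and the agreement check described above can be sketched as follows. The group structure (ten groups of 70 items, made up of 30, 20, and 20 items) and the unit weights come from the text, as do the family codes OCA, LFJ, and TMI. The respondent's markings, the helper names, and the tie-breaking rule are hypothetical and serve only to illustrate the procedure.

```python
# Composition of each occupational-family group, as described in the text.
ITEMS_PER_GROUP = {
    "occupational titles": 30,
    "school subjects, magazines, prominent persons, etc.": 20,
    "leisure and work activities, peculiarities of people": 20,
}
N_GROUPS = 10


def group_score(marked):
    """Unit-weight score: one point for every item marked in the group."""
    return sum(1 for item in marked if item)


def top_groups(scores, k=1):
    """Return the k group codes with the highest scores.

    Ties are broken alphabetically by code, an arbitrary choice made here.
    """
    ranked = sorted(scores, key=lambda code: (-scores[code], code))
    return ranked[:k]


def agrees(scores, occupation_group, k=1):
    """True if the person's own occupational family is among the top k."""
    return occupation_group in top_groups(scores, k)


# Hypothetical respondent, shown for three of the ten groups only.
responses = {
    "OCA": [True] * 41 + [False] * 29,
    "LFJ": [True] * 12 + [False] * 58,
    "TMI": [True] * 40 + [False] * 30,
}
scores = {code: group_score(marks) for code, marks in responses.items()}

if __name__ == "__main__":
    print(scores)
    print("highest group:", top_groups(scores))
    print("TMI among top three:", agrees(scores, "TMI", k=3))
```

The two agreement figures quoted in the text correspond to calling `agrees` with `k=1` and `k=3` for each standardization case and taking the percentage of cases for which it holds.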

The scores are quite reliable, ranging from about .82 to about pi
(manual). However, as many have pointed out (622), the grouping of
items by occupational families makes them easily recognizable and spuriously
increases reliability: an examinee readily sees that a given section
is, e.g., the engineering section, reacts “I want to be an engineer, I like
these,” and gives favorable responses to some items which would be
marked differently if they were scattered among other items. Unfortunately
this hypothesis has not been checked experimentally, but counseling
practice suggests, and the relatively high reliabilities seem to confirm
the hypothesis, that this is a valid criticism.

Validity. There have been few studies of the validity of the Cleeton,
further testimony of the fact that it has not challenged most vocational
psychologists; most of the published studies are not concerned with the
relationship between inventory scores and external criteria. It was
administered to students of education by Congdon (i(>8), who found
significant differences between men and women who planned to teach,
on the one hand, and who planned not to teach, on the other. She also
found that scores in the field of claimed interest were higher than scores
in fields in which no interest was claimed, but this is not surprising in an
inventory as seemingly transparent as this. Even the former finding may
be spuriously high because of the same sort of halo effect or stereotyping.

The correlations between Cleeton scores and Strong’s scales were
computed by Arsenian (214) for 150 Springfield College freshmen who
took the two inventories at intervals of one week (Strong Blank first).
Scores for the Strong scales which belong to the same occupational
family were combined to yield group scores comparable to Cleeton’s,
and the two sets were correlated. The coefficients of correlation ranged
from .16 (LFJ and Lawyer-Author-Journalist) to .(>8 (TMI and the social
welfare scales), the average being ..\r y . This is slightly lower than the
correlations between the Strong and the Kuder, which have less item-similarity
than the Strong and the Cleeton. It would clearly not be wise
to use the Cleeton as a substitute for the Strong Blank, although there is
considerable similarity in the inventories and in the meaning of the
scores.

Use of the Cleeton Vocational Interest Inventory. In view of the 
availability of more thoroughly studied inventories such as the Strong, 
the Allport-Vernon, and, more recently, the Kuder, there is little justification
for using an instrument concerning which there is still so much
room for questioning and for which there is still little in the way of field
validation. Although Cleeton’s standardization data are rather impressive,
there has not yet been enough follow-through on the inventory to
make it a well-understood instrument.

The Lee-Thorpe Occupational Interest Inventory (California Test Bureau,
19 pp

This new inventory has been available for so short a time that practically
nothing has appeared concerning it in the professional journals.
The writer has located no studies of its validity, and practically all that
is known concerning it is in the manual and “Occupational Selection
Aid” supplied with it.

Description. The items were written in simple language, with a
vocabulary grade placement of only 6.8 (752); it (Advanced Form A) is
therefore easily understood by junior and senior high school boys and
girls. The paired comparison form is easily handled also at that level.
The items are not, however, offensive to adults; they are based on the
Dictionary of Occupational Titles (888), and so have the aura of authenticity.
It is scored for fields somewhat like Kuder’s, by simple item-count.
The inventory itself therefore looks attractive to users of vocational tests.
The manual shows that it is reliable (.71 to .93). The norms are based on
1000 12th-grade students, and are said to be applicable to any high
school grade and to adults, a claim which seems improbable, in view of
Strong and Carter’s work and of tentative findings reported by Lindgren (473).

Validity. The only claims for validity set forth by the manual are
based on the source of items, the design of the items, the balance of
activities sampled, and the presentation of items. All of these, it should
be noted, are internal, not external, criteria, and are dependent upon
the good judgment of the test authors rather than upon objective evidence.
The inventory is therefore still in the embryonic stages, lacking
evidence of occupational validity. Lindgren (473) has, however, reported 
a substantial relationship between appropriate Lee-Thorpe and Kuder 
scales. 

Use of the Lee-Thorpe Occupational Interest Inventory. The nature 
of the inventory makes it attractive to potential users, but it is at present 
a purely experimental form which has yet to be validated against occupa¬ 
tional criteria. It may therefore be used in research by those who have 
the resources for conducting validation studies, or as an interview aid, 
but has no value at this point as a diagnostic or prognostic instrument. 

The Michigan Vocabulary Profile Test (World Book Co., 1939) 

Unlike the other instruments discussed in this chapter, this is a test 
rather than an inventory. It is virtually the one information test of 
interests now available, although the Army Air Forces (316: Ch. 14) 
developed one which was quite valid for pilot and navigator selection 
and will no doubt stimulate civilian counterparts. The Michigan test 
was developed by E. B. Greene, as a test of specialized vocabulary which 
might be prognostic of interest and success in several fields of activity. 
It was little used before World War II, but has since been widely used 
in work with veterans. 

Description. Two forms are available, each of which was designed 
for high school and college use and has eight divisions: human relations, 
commerce, government, physical sciences, biological sciences, mathe¬ 
matics, fine arts, and sports. There are 240 items divided among these 
eight areas, each phrased as a definition followed by four terms from 
which the one which corresponds to the definition must be selected. 
Items are arranged in ten levels of difficulty, three items per level. An 
attempt was made to eliminate terms which could be guessed by knowl¬ 
edge of roots, prefixes, etc., thus reducing the effects of reasoning and 
restricting the test to information. The items were selected from more 
than 6000 submitted by students in the various fields. Groups of items
were refined by internal consistency analysis, all items being required 
to correlate .30 or above with the score on that part. The inter-form 
reliabilities range from .78 to .94, with a median of .81. Administration 
is untimed, most college students finishing in about one hour and high 
school students sometimes requiring as much as one and one-half hours. 
The test can be machine or hand-scored with stencils; the score is the
number right for each part. A profile chart is provided on the answer
sheet. Norms are expressed in percentiles, and are based on 4677 students 
from 9th grade through college, and are available for both part and 
total scores; this means that each norm group contains an average of 
slightly less than 600 persons. Because of the limited number of items 
in each scale, the percentiles change rapidly: a raw score of if) on the 
human relations scale places a college freshman at the 31st percentile, 
while one of 17 places him at the 50th. This is the unfortunate result of
a steeply graded test; it would probably have been better to have separate 
forms for high school and college, and to have more items working at 
each level in order to get a better spread of raw scores and of percentiles.
As it is, too much emphasis is put upon chance factors which affect the 
answering of any one item. Increases in scores with grade occur, as would 
be expected in a vocabulary test. Finally, profiles are given for students 
in several professional curricula, including law, nursing, engineering, 
business administration, medicine, education, and social studies, the 
numbers for these groups ranging from 125 to 182. These do not actually 
constitute norms, as only the means are given, but they do aid in inter¬ 
pretation. 
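
The figures in this description can be cross-checked with a little arithmetic. The item counts, the eight divisions, and the norm sample of 4677 come from the text. The number of norm groups is an assumption: eight groups (for instance four high school and four college years) is the count that yields the "slightly less than 600" per group mentioned above. The per-item percentile figure is a rough average over the whole score range and is offered only as an illustration.

```python
# The eight divisions of the test, as listed in the text.
DIVISIONS = [
    "human relations", "commerce", "government", "physical sciences",
    "biological sciences", "mathematics", "fine arts", "sports",
]
TOTAL_ITEMS = 240
LEVELS = 10          # levels of difficulty
ITEMS_PER_LEVEL = 3  # items at each level

# Items per division, and a check against the level structure.
items_per_division = TOTAL_ITEMS // len(DIVISIONS)
structure_is_consistent = items_per_division == LEVELS * ITEMS_PER_LEVEL

# Average size of a norm group.
NORM_SAMPLE = 4677
ASSUMED_NORM_GROUPS = 8  # assumption: not stated explicitly in the text
average_group_size = NORM_SAMPLE / ASSUMED_NORM_GROUPS

# With so few items per scale, one raw-score point covers, on average,
# this many percentile points; near the middle of the distribution the
# step is considerably larger.
average_percentile_points_per_item = 100 / items_per_division

if __name__ == "__main__":
    print(items_per_division, structure_is_consistent)
    print(average_group_size)
    print(round(average_percentile_points_per_item, 1))
```

The small number of items per scale is what makes the percentile conversion so coarse: a single item answered by chance can shift a student's standing by several percentile points.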

Validity. Unfortunately there have been almost no studies of the
relationship between scores on the Michigan Vocabulary Profile Test
and other variables, although data are needed on the relationships with
intelligence, inventoried interests, grades, completion of training, occupational
choice, success in various occupational fields, job satisfaction,
and other external criteria. It is surprising that an instrument which
has been as widely used as this during the postwar years has had so little
publication; presumably this deficiency will be remedied after sufficient
time has elapsed for analysis of the data accumulated by the veterans' testing
and counseling programs. One bit of internal evidence concerning the
validity of the test is contained in the manual, which shows that none
of the part scores correlate more than .54 with any other, the averages
for each scale ranging from .15 to .34. Thompson (826) has reported
differences between more and less successful executives.

Use of the Michigan Vocabulary Profile Test. Like many other 
published tests this one is still in an embryonic stage because there has 
been no follow-through in the collection and publication of validation 
data and vocational norms. It has been widely used since World War II 
in work with veterans, because its grade norms for specialized vocabularies
have made easier the evaluation of the readiness to resume a high
school or college education of somewhat mature young men whose
education had been interrupted. Clients whose informal education has
given them much of the vocabulary of a special field can be assumed
(there is no actual published evidence) to have some of the prerequisites
of success in that field. The usefulness of the Michigan Vocabulary Profile
Test will probably be limited to such cases, and to the diagnosis of
reasons for failure in educational programs, until more complete validation
has been carried through.

Trends in New Measures of Interests

Although the discussion of the widely used measures of interest which 
constitutes the body of this chapter has brought out many of the important
trends in interest test construction, there are certain other developments
which are not made clear by work with these instruments. For
this reason developments with some less widely used tests, some of them
not available for general use, are briefly considered in closing this chapter.

The use of simpler and more familiar items, describing or pertaining 
to activities which have been almost certainly within sight and reach of
the subjects for whom the instrument is designed, is one trend which
seems clear in recent interest inventories. We have seen that the Lee-Thorpe
inventory succeeded in keeping a 6th grade vocabulary level.
The Dunlap Academic Preference Blank (World Book Co., 1939) was
developed for use in grades six through nine, utilizing vocabulary items
related to the subject matter of those grades and familiar to pupils
through their studies (219,220,713); it yields scores for degree of interest
in literature, geography, arithmetic, history and other subject areas,
plus measures of mental ability. The Gregory Academic Interest Inventory
(Sheridan Supply Co., 1917) is a somewhat similar inventory, based
on liking for high school subjects and activities (312), and designed to
help college students in the selection of challenging curricula. An
Activities Interest Inventory on which T. L. Kelley has worked for some
years (562) attempts to tap only activities with which the typical respondent
(high school youth and wartime Army enlisted men in some of the
basic studies) is familiar without occupational experience and to use only
terms easily understood by him. In so far as it insures comprehension
by the subject and uniformity of interpretation this is a highly desirable
trend; but if, as it seems may have been the case with the Kuder, this
increases the transparency of the inventory to the point of risking its
essential validity as a measure of underlying interests, this would be
unfortunate. This is not a necessary resultant, however, as it should
be possible to locate an ample number of items the meaning of which
is clear to the subjects for whom the inventory is intended, but the occupational
significance of which is hidden. Strong’s Vocational Interest
Blank seems to contain a number of these. 

The measurement of factors appears to be favored in the development
of new inventories, rather than the measurement of interests peculiar to
specified occupations. To some extent this is a reflection of current interest
in factor analysis, and perhaps even of a realization of the contribution
which factor analysis can make to the purification of measures
and the improvement of predictions, as pointed out by Guilford (316,
317). But as most of the inventories which measure types of interests
(interest “factors”) have arrived at these by methods other than factor
analysis, however legitimate, and have not developed occupational norms
to serve as a guide in the interpretation of the factor scores (Kuder shows
signs of becoming a notable exception to this generalization), one is inclined
to suspect that the trend is in part the result of a tendency to
choose the easy and the short way, to rely on a priori or at best internal
indices of occupational significance rather than on external criteria. Test
constructors and users should therefore be wary of the interest inventory
which measures types of interests without providing objective evidence
of the occupational significance of these interest factors.

Information tests of interests are again gaining favor, as factor analysis
and related internal-consistency and item-validation techniques are
making it possible to construct instruments which measure information
important to a variety of fields in a reasonable length of time. For example,
it takes the O’Rourke Mechanical Aptitude Test, one of the
first information tests of interest and aptitude, nearly one hour to
measure mechanical information, whereas the Air Forces’ General Information
Test assessed interests of differential significance for success
as bombardier, navigator, and pilot in no greater length of time. A few
words on the nature of the instruments may be worthwhile, in order to
make clearer the direction developments may take.

The AAF General Information Test had five antecedents: a Technical
Vocabulary Information Test developed by R. N. Hobbs and J. W.
Thatcher (316:350-358), a Sports and Hobbies Participation Test devised
by R. R. Blake and the writer (316:343-350), a Flying Information
Test developed by the writer as a sub-test of the above (316:361), a Mechanical
Information Test constructed by P. C. Davis and L. Hutchinson
(316:323-327), and miscellaneous technical vocabulary items developed
by F. B. Davis (316:361), all based on somewhat similar principles but
focusing on different kinds of content, as the titles indicate. The sports
and hobbies test, for example, included items pertaining to driving a
car, basketball, diving, hunting, building model planes, playing poker,
motorcycling, and woodworking as active masculine avocations, and
reading, music, etc., as sedentary, feminine activities. A sample item is:

To “draw,” a pool player hits the cue ball 

A at the right. 

B at the left. 

C high. 

D low. 

E don't know. 

These and other items were selected on the basis of several hypotheses:
1) successful pilots, navigators, and bombardiers are differentiated by
their personality traits and interests (e.g., masculinity-femininity); 2)
these traits manifest themselves in interest and participation in some
activities and lack of interest and participation in others; 3) interest
and participation result in the acquisition of specialized knowledge not
acquired by others. Particularly in the tests developed by or under the
supervision of Blake and the writer, it was assumed that information
which could be acquired only through participation, as opposed to
observation, would differentiate most clearly the interested from the
uninterested. The activities or fields of knowledge tapped by the various
information tests were selected on the basis of expected relationships
between personality traits, interests, activities, and success in the three
air crew jobs. The items of each of the tests mentioned were selected
first on the basis of internal consistency, then on the basis of validity
(correlation with success in training). Only valid items were retained
and incorporated in the General Information Test. The validities of the
antecedent tests for success in primary flying training (biserial r’s with
graduation-elimination) are given in Table 34, together with those for
both the final form of the General Information Test for primary flying
and, for an unselected experimental group, for both primary and all
levels of flying training.
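
For readers unfamiliar with the statistic named above, the following is a minimal sketch of how a biserial r between a test score and a graduation-elimination criterion may be computed. It uses the usual textbook formula (difference between the two group means, divided by the standard deviation of all scores, multiplied by pq over the ordinate of the normal curve at the point of division). The function name and the sample scores are hypothetical illustrations and are not taken from the Air Force studies.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(scores, passed):
    """Biserial r between a continuous score and a pass/fail criterion.

    scores: test scores, one per examinee (hypothetical data).
    passed: True for graduation, False for elimination.
    """
    grads = [s for s, ok in zip(scores, passed) if ok]
    elims = [s for s, ok in zip(scores, passed) if not ok]
    if not grads or not elims:
        raise ValueError("need at least one graduate and one eliminee")

    p = len(grads) / len(scores)   # proportion graduating
    q = 1.0 - p                    # proportion eliminated
    sd = pstdev(scores)            # standard deviation of all scores
    if sd == 0:
        raise ValueError("scores have no variance")

    # Ordinate of the unit normal curve at the point dividing p from q.
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))

    return (mean(grads) - mean(elims)) / sd * (p * q / y)


if __name__ == "__main__":
    # Invented scores for eight cadets, five of whom graduated.
    example_scores = [12, 15, 9, 20, 17, 8, 14, 19]
    example_passed = [True, True, False, True, True, False, False, True]
    print(round(biserial_r(example_scores, example_passed), 2))
```

The biserial coefficient assumes that a normally distributed trait underlies the pass/fail division; with very small or very skewed samples it can exceed 1.00, which is one reason large groups such as those in Table 34 are desirable.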

The substantially higher validities for the experimental group can be
explained at least partly by the unselected nature of the sample, for these
aviation cadets were sent to training regardless of scores on the various
psychological tests used by the Air Force in order to obtain true indices
of test validity. The range of abilities being less restricted, the true
relationships were revealed.
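
The effect of restriction of range can be illustrated with the standard correction for direct selection on the predictor, which estimates what a validity coefficient found in a selected group would be in an unselected one. This formula is not given in the text and is offered only as a sketch; it assumes a linear relationship with equal scatter throughout the range. The function name and the figures in the example are hypothetical.

```python
import math


def correct_for_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Estimate the validity in an unselected group from that in a selected one.

    Uses the common correction for explicit selection on the predictor:
        R = r*k / sqrt(1 - r**2 + (r*k)**2),  where k = SD_unselected / SD_selected.
    Assumes linearity and homoscedasticity (an assumption, not a finding
    reported in the source).
    """
    if sd_restricted <= 0 or sd_unrestricted <= 0:
        raise ValueError("standard deviations must be positive")
    k = sd_unrestricted / sd_restricted
    rk = r_restricted * k
    return rk / math.sqrt(1.0 - r_restricted ** 2 + rk ** 2)


if __name__ == "__main__":
    # Invented figures: a validity of .30 in a group whose score spread
    # is half that of the unselected population.
    print(round(correct_for_restriction(0.30, 2.0, 1.0), 2))
```

When the two standard deviations are equal the coefficient is returned unchanged; the wider the unselected range relative to the selected one, the larger the estimated validity, which is the direction of the difference between the selected and experimental groups in Table 34.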

It is interesting to contrast the approach of these information tests with
that of the Michigan Vocabulary Profile Test. While the latter used
internal consistency as its criterion of item inclusion, and then proceeded
tentatively to establish patterns of interest factor scores for various
curricular groups, the Air Force information tests used factorial hypotheses
as a basis for writing items, but included items in the scoring keys only
as they proved to have individual validities for occupational prediction.


Table 34

VALIDITY OF INFORMATION TESTS OF INTERESTS FOR PILOT TRAINING

Test                                     N          r            Criterion

Sports and Hobbies Participation Test    432-501    .30 to .36   Primary School
Flying Information subtest               374-5*8    .32 to .34   Primary School
Auto Driving subtest                     371        .36          Primary School
Hunting subtest                          486        .14          Primary School
Music subtest                            118        -.18         Primary School
Reading (literature) subtest             287        -.14         Primary School
Technical Vocabulary Information Test    3*5*       .17          Primary School
Mechanical Information Test              513-3151   .23 to .32   Primary School
General Information Test                 406-3146   .17 to .2*   Primary School
General Information Test (214:191)       131*       .46          Experimental group, Primary
General Information Test (214:191)       1311       .51          Experimental group, all schools


This was done in order to put valid tests into use at the earliest possible
date. The next step was a factor analysis of the tests to reveal what factors
are measured and how unique they are; as Guilford has shown (316:817,
830-831), the information tests did measure a pilot-interest factor. The
next step would be to break this factor down by developing tests or subtests
in which the items of the general information test are grouped according
to hypotheses concerning the primary interest factors constituting
the global pilot-interest factor, checking these for internal consistency
and independence, making another factor analysis, and, if the tests seem
promising, validating these purer factorial measures. The first step in this
last sequence was taken in the Flying Training Command at J. C. Flanagan’s
instigation (316:673-680) but was interrupted by the decline in
training activities. Work along these lines was resumed by F. B. Davis,
J. C. Flanagan, and the writer (925:68-74) in the Personnel Distribution
Command in a study of combat leadership, the results of which were
inconclusive but, insofar as they did reveal tendencies, showed that officers
promoted most frequently while in combat tended to be more feminine,
and those promoted less often more masculine. This was exactly the
opposite of the expected results, and the opposite of what inspection of
significant items for the prediction of success in training had suggested.
In training, it was the masculine, active, competitive items on which the
successful fliers did better than the failures. As the training data were
clearly significant and the combat data highly tentative, the latter
relationships obviously need confirmation. Also, the promotion criterion,
although seemingly as good as any available, was shown in studies by
J. P. Chaplin, H. D. Charner, W. G. Mollenkopf, and the writer (925:77-83)
to be far from ideal as an index of success in combat flying.

Although the wartime work of the Air Force with information tests of
interest and personality factors was interrupted at the point described
above, further work is being done both in and out of the services with
these techniques. They seem to the writer, who may be a biased observer
in this instance, to be full of promise for the future.



CHAPTER XIX 


PERSONALITY, ATTITUDES, 
AND TEMPERAMENT 


Nature and Development 

THE field of personality is one of the most popular, challenging, important,
and confused in contemporary psychology. It was neglected by
psychologists in the infancy of that science, studied by psychiatrists and
psychoanalysts who used uncontrolled clinical methods, and then finally
taken under consideration by psychologists who possessed scientific methods
but too often lacked the orientation to persons as such which characterized
the clinically trained medical men. It is therefore small wonder
that the psychology of personality has been in a chaotic state. The origin
and development of the theories of personality which one encounters
today are hardly a topic for a book on the use of tests in vocational guidance
and selection; treatments of the subject which were current when
most of the available tests and inventories of personality were being developed
will be found in psychological works by Allport (il>), Brown
(iim), Shaffer (yocj), and Stagner (y.pj). Murphy (551) has published a
recent comprehensive treatment of the subject, which he also dealt with
earlier in his collaborative synthesis of work in experimental social
psychology (r,r, 5). Hunt (391) has edited a generally excellent and
up-to-date symposium of encyclopedic dimensions and scope; the chapter on
inventories is, however, unfortunately weak. But it is relevant to consider
the subject here from the point of view of the vocational counselor or
personnel officer, from the perspective of the user of personality tests for
vocational purposes.

Definitions. Some psychologists like to consider the personality as a
whole, to think of it as a global unit, complex in nature but unanalyzable,
a viewpoint often arrived at in the Gestaltist’s protest against the unduly
atomistic approaches of some Behaviorists. To the scientifically minded
person this point of view often seems mystical, vague, and of little value
in practice. Another approach defines personality in terms of the reactions
aroused in others, as social stimulus value. To many psychologists this
approach seems too limited in its empiricism, as it leaves the individual’s
personality in other persons, whose reactions are not completely uniform.
A third definition treats personality as a pattern of traits or ways of
reacting to external stimuli. Personality is then both analyzable and unitary;
the operationalism of this definition appeals to the scientist. The organismic
or global approach to personality has something to contribute to
this last viewpoint, for one can think of the individual as a more or less
organized and integrated unit, and of the process of emotional development
as one in which an attempt is made to organize a variety of reaction
patterns or modes of behavior into an integrated, smoothly working
whole. One in whom a degree of integration appropriate to the demands
made upon him by society has taken place is an emotionally adjusted
person, while one in whom the integration has not taken place to the
extent required by the demands of the environment, or one in whom
the integration has partly broken down because of demands with which
he was not able to cope, is an emotionally maladjusted or disturbed
person.

Psychologists interested in vocational guidance and personnel work
seem to have found the concept of personality as a patterning of traits
most helpful in their work, for discussions of emotional or personal adjustment
and of personality traits abound in the literature, and attempts
to measure both general adjustment and specific traits and to ascertain
their significance for vocational success have been numerous. In an otherwise
excellent discussion Warren (911:1015) states that the vocational
counselor is less concerned with the degree of integration achieved by the
client than with the nature and degree of his specific characteristics, for
these determine his adjustments to his environment. To the writer this
seems to be too limited a view, for adjustment to the environment is
partly a matter of adjustment to oneself, and adjustment to oneself is to
a considerable extent a matter of the degree to which the various traits
of one’s personality are integrated. In a well-integrated personality the
various internal needs and reactions to the various external pressures are
harmonious: the person is impelled, driven, or attracted in one general
direction (minor needs and presses to the contrary being taken care of by
the strongly integrated unit), and is therefore able to function effectively.
In the unintegrated or disintegrated personality, on the other hand, the
reaction patterns are not harmonious, he is pulled and driven in various
directions, there is internal conflict, and functioning in society is impaired.
The vocational counselor and psychologist, and the personnel
man who wants an effective employee, are therefore very much concerned
with the degree and type of integration as well as with the specific traits
which are organized into the whole.

Role of Personality in Education and Occupation. Approaches to the
study of the significance of personality and temperament traits for success
and satisfaction in school and at work have generally followed one of two
patterns: 1) the clinical, in which case-history material is cited in order
to illustrate dynamics and document (if not prove) a theory; or, 2) the
psychometric, in which reliance has of necessity been placed upon the
imperfect instruments available for the measurement of personality. In
the former approach the findings prove little because of subjectivity and
lack of controls, although they stimulate speculation; in the latter they
prove little because of technical defects, although they do underline the
need for better instruments. The end result is that our current knowledge
of the role of personality in education and in work is impressionistic or,
when quantitative, superficial. It has been shown by surveys of employment
records, for example, that personality problems are the most
common cause of discharge from employment (118,390). Case studies
demonstrate that difficulties in learning to read are often caused by problems
of parent-child relations, and observation led to the suggestion that
some people considering engaging in social work are motivated by an
unconscious desire to solve their own problems rather than to help solve
those of others. But none of these studies have yielded data which would
enable one either to measure the extent and nature of the characteristics
involved, or to predict their interference or noninterference with success in
any specific type of educational or vocational endeavor. The conviction
of their importance is strong and nearly universal, but the evidence is
virtually lacking and the means of measuring the characteristics are sadly
defective. It is only for values and interests that techniques have been
more adequate and results more conclusive; these have been discussed
elsewhere.

One reason for the lack of adequate objective evidence on the vocational
and educational significance of personality traits is that students
of vocational and educational adjustment have generally been specialists,
not in personality, but in management, aptitudes, or instruction, while
students of personality have generally been interested, not in vocations
or in education, but in psychological theory or in clinical diagnosis. Some
of the personality inventories (e.g., Bell, Bernreuter) are an exception to
this rule, but they suffer from the defects of the inventory technique,
which are most serious in the field of personality; the more penetrating
instruments (e.g., Minnesota Multiphasic, Rorschach, Thematic Apperception
Test) were devised for the study of personality organization or for
the diagnosis of emotional disturbances. For our purposes, what is needed
is a penetrating measure applicable and applied to occupational rather
than to hospitalized populations.

In view of the lack of sufficient objective evidence for a practically
useful discussion of personality and vocational success, the results of what
studies have been made will be reserved for the sections dealing with
specific instruments. Some comments are, however, called for in explanation
of the failure to find clear-cut relationships between personality and
occupations in the few studies which have been made with the more
penetrating tests.

Although it has been assumed that there should be linear correlations
between certain personality traits and success in some occupations, for
example social dominance and selling, submissiveness and bookkeeping,
introversion and research or writing, such relationships have in fact been
found in very few occupations: a somewhat higher degree of dominance
has been found in salesmen than in clerical workers (587,201), but otherwise
few significant differences have been reported. The fact that some
significant differences do exist, and that some personality measures do
have a degree of clinical validity, suggests that the general failure to find
occupational personality patterns may be because personality is not related
to occupational choice and success in the commonly expected manner.
Even in an occupation such as bookkeeping, a dominant individual
may find outlets through advancement into supervisory and managerial
positions; research may accommodate extroverts as well as introverts, for
example, in sociological field studies, industrial chemistry, and the supervision
of projects; and the literary extrovert may find outlets in public
relations work, some forms of advertising and radio, or even fiction
writing when formulas rather than creative imagination and insight are
required. A lawyer may be a bookworm or a dramatist, a scholar or a
promoter; a carpenter can work in morose silence, or exchange remarks
and jibes with associates and passers-by between blows of his hammer;
a packer may daydream or talk about the movies and the neighbors while
placing batteries in cartons. Roe’s stimulating exploratory studies seem to
confirm this hypothesis for artists (('>35) but to contradict it for
paleontologists ((FjG).




But if personality traits and temperament are not generally related to
occupational choice or success, how, if at all, do they play a part in
vocations? If the hypothetical examples given above are indeed valid,
then personality as defined in this discussion determines the kinds of
adjustment problems which the worker will encounter in any occupation
he enters. If he is outgoing and his associates withdrawn he will have one
kind of difficulty, but it may be solved by changing associates rather than
changing occupations; if he likes sedentary mental work rather than
active contact work he may be a writer of books on his research rather
than a promoter of the financing of more research or the administrator
of a research project; if he is socially dominant the assembly worker may
be the social leader or the thorn in the flesh of his fellows, rather than a
follower or isolate in the group. They will all be happy or unhappy in
their work, depending upon the ease with which they make the modifications
which it requires in their modes of behavior. That such modifications
are indeed made has been demonstrated not only with nursery
school children by Page (584) and Jack (395), but also with college students
by McLaughlin (498); although these studies did not demonstrate
that the underlying traits were modified, they did show that the surface
modes of adjustment were changed in ways which made the persons
concerned function more effectively in their social groups. Since personality
traits have been defined as modes of behavior, they may be said to
have been modified.

If one were to ask, then, why bother to measure personality and temperament
traits in personnel and vocational guidance work, there are
two answers. First, a poorly integrated personality (poor general adjustment)
may have trouble adjusting in any training or work situation, and
should either be screened out or given professional assistance in solving
his emotional problems. Second, a person with traits which are likely
to make for adjustment difficulties in certain types of positions may be
placed in a situation which is so structured as to turn his liabilities into
assets or at least to minimize the chances of difficulty; he may be given
psychotherapy to modify his personality in such a way as to facilitate
adjustment; or environmental methods may be used to develop new
modes of behavior which are more effective. Many instances of maladjustment
which appear at first to be vocational prove, after more careful
examination, to be deep-rooted in the personality (257,442). When this is
true, treatment by changing work situations or by on-the-job counseling
may be necessary. The reason for making a personality diagnosis in
vocational guidance and personnel work is, then, to screen problem
cases and to assist in the making of more effective adjustments.

Measures of Personality 

Until about 1935 only two types of instruments for measuring personality
and temperament traits were widely used in the United States:
rating scales and inventories. These were both first put into extensive use
and popularized during World War I, when Woodworth developed his
Personal Data Sheet and various Army rating scales were experimented
with; the details have frequently been written up, and will be found in
Symonds (810). By 1935 several hundred personality inventories had been
developed, but very few of them had been systematically studied after
their first tentative launching, and the sophisticated segment of the
test-using public had become wary of them (794). Rating scales also had
proved disappointingly unreliable and invalid, but like personality inventories
they were still used in many places either because the users
were not fully aware of their limitations or, more often perhaps, because
there seemed to be nothing better to use.

In the thirties, however, another type of personality measure was introduced
to the United States with the development of interest in the
Rorschach Psychodiagnostik (f>q.j), a series of inkblots first devised as a
projective technique by a Swiss psychiatrist by that name, and with the
publication by Murray (557) of the Thematic Apperception Test, a series
of semistructured pictures concerning which the subject makes up stories.
In these as in other projective techniques the examinee is presented with
an ill-defined situation (inkblot, clouds, collection of toys, clay, or ambiguous
pictures) and permitted to make what he will of it; the tendency
is to structure it according to his own needs, thus revealing his personality
traits unbeknownst to himself. The clinician must then draw upon
his own skill and insight to tease out the meaning of the figures, objects,
scenes, or stories constructed by the examinee. Although methods have
been devised for obtaining seemingly quantitative scores from some of
these tests, they are still essentially clinical techniques, rather than tests.
The fact that they appear to be more penetrating than personality inventories
and have captured the interest of clinicians and researchers suggests
that they will in time be greatly improved and transformed into
more objectively scorable tests, but for the time at least they are limited
to clinical use. During World War II interest was revived in other
little-used projective techniques, one adapted from a type of intelligence test
item: the incomplete sentences test and the unstructured situation test.
These are still in experimental stages.

In selecting specific tests for discussion in this chapter choices are
limited to two types of instruments, personality inventories and projective
tests, neither of which is presently very satisfactory or valuable to the
vocational counselor or personnel man, and only one of which is of much
value to the vocational psychologist. Rating scales are not discussed, as
they are filled out by persons other than the examinee and are dealt with
in other texts (538,768). Space is devoted to inventories and projective
tests, however, for two reasons: 1) increasing use is being made of both
types of measures in both personnel work and vocational counseling
despite widespread disillusionment with one type and skepticism regarding
the other; and, 2) workers in the field need to know what has been
done and is being done in the field of personality measurement, so that
they may handle inquiries and take advantage of progress as it is made.
Personality tests and inventories are intriguing; it is well for the potential
user to know the nature of their limitations in some detail.

Two personality inventories are dealt with in some detail: one, the
Bernreuter Personality Inventory, because there is more published evidence
concerning it than concerning any other inventory and because it
is typical of many; the other, the Minnesota Multiphasic Personality
Inventory, because it represents a somewhat different approach and has
come into wide use and popular favor among psychologists. Other inventories
discussed more briefly are the widely used Bell Adjustment Inventory
and the carefully constructed but new and less studied Minnesota
Personality Scale. One personality inventory developed for use in the
wartime Army Air Force and no longer usable, the Satisfaction Test, is
briefly described because of its implications for work with inventories in
selection programs. Many other inventories might be commented on, but
the discussion of the above-named instruments should help the reader to
examine critically the sweeping claims often made by publishers and
authors.

Two graphic projective techniques are treated in some detail, from the
vocational counseling and selection point of view: these are the Rorschach
Inkblots and the Murray Thematic Apperception Test, both because
of the widespread interest in them and because they are now being used
in occupational research. Two series of projective situation tests are
described much more briefly, because of their possible significance for
future work: the series used by the Office of Strategic Services, and one
experimented with in the Clinical Techniques Project of the Army Air
Force. Finally, work with the Incomplete Sentences Technique is briefly
discussed for the same reason.

The Bernreuter Personality Inventory (Stanford University Press, 1931)

This personality inventory was based on earlier work done by Woodworth,
Thurstone, Laird, and Allport, its principal contribution being
its success in combining the items from several personality scales in one
blank. Although a commonplace today, this was then a time- and
materials-saving novelty, and probably did more than anything else to give
the inventory its widespread popularity. When the studies published
prior to 1941 were reviewed (794), the aggregate totaled 134; many more
have since been published. As the objective in this discussion is relevance
to vocational guidance and personnel work rather than completeness, no
count has been made for the succeeding years: those utilized alone
amount to 27. The inventory is clearly still popular and widely used,
despite a great deal of criticism.

Applicability. The Bernreuter Personality Inventory was designed for
use with adolescents and adults. Nothing has been found in the literature
or encountered in counseling practice which suggests that the vocabulary
and experiences sampled are inappropriate to those age levels.

Age does not affect scores in relatively homogeneous populations such
as those studied by Bernreuter (87), Carter (143), and Miles (328), although
in more heterogeneous groups self-sufficiency and dominance
seem to increase with age.

It has been demonstrated that experiences planned to modify personality
traits affect some types of scores on the Bernreuter ((>4(1,882). In the
latter study the results may have been vitiated by training in the significance
of behavior such as that described in the items of the inventory,
for the experience was a course in applied psychology; Hartmann’s findings
(347) support this interpretation. In the former the findings are
more convincing, for the experience consisted of speech training provided
for experimental but not for control groups, and the numbers were large.

The effect of rapport has been investigated in a number of studies
using the Bernreuter, the nature of the findings depending, as might be
expected, upon the design of the experiment and the phrasing of directions.
Bernreuter (8f>) administered the inventory to students under
normal conditions, then readministered it with instructions to answer
it: a) “as you would like to be,” and b) “as you think you ought to be.”
He found no significant differences, from which fact he concluded that
the desire for social approval does not appreciably affect scores. When
somewhat different directions have been used, however, distortion of
scores has been shown: Olson (575) quotes an unpublished paper by
Hendrickson which demonstrated that teachers retested with instructions
to answer as though applying for a job made significantly more stable,
dominant, extroverted, and self-sufficient scores than when answering
normally; Ruch (b^b) found that college students raised their average
extroversion percentile from the 50th to the 98th when asked to fake
extroversion on a retest; and Fosberg (270) found that subjects instructed
to make a good and then a bad impression on second and third testings
succeeded in influencing their scores in the desired directions. As the
instructions in the last three experiments are more appropriate for testing
the effect of conscious desire to fake than were Bernreuter’s, it may
be concluded that the desire to make a good impression, when it exists,
does affect scores. Bernreuter’s directions and results do seem to warrant
the conclusion that there is little if any disparity between the responses
of persons in a nonevaluative situation (e.g., students who know they will
be marked on the basis of their achievement rather than on their personality
inventory scores) and their responses when asked to reply in
terms of their ideal selves; this only proves that the self-concepts of students
differ little from their self-ideals.

Mood might also be expected to affect scores on a self-descriptive scale
such as the Bernreuter, but only two studies bear on that question. One
is Johnson’s (job) comparison of the scores of 15 college women tested
in periods of mild elation and again in periods of mild depression, in
which only some differences approached significance, low moods being
accompanied by slight shifts toward neuroticism, dependence, and
submissiveness. Johnson attributed the lack of significant differences to
the freezing of responses once given to the Bernreuter items. The case of
a suicide was reported by Farnsworth and Ferguson (217); his neuroticism
score changed from the 50th percentile 15 months, to the 83rd three
months, before suicide. Although the findings are by no means conclusive,
the indications are that normal mood changes have no great effect on
Bernreuter scores, while abnormal do.

Content. The Personality Inventory consists of 125 questions based
on those used in earlier inventories, such as: “Are your feelings easily
hurt?” Answers are recorded on the blank, in terms of “yes,” “no,” and
“?” There are few extreme or potentially offensive items, making the
inventory acceptable to most groups; with groups of adolescents, however,
it is desirable to minimize opportunities for laughter and joking
by businesslike administration and good proctoring.

Administration and Scoring. The inventory is self-administering, with 
no set time limit, and takes from 20 to 30 minutes. Examinees sometimes 
ask what is meant by a question, for definitions of terms such as “fre¬ 
quently”; although on the face of them these questions may seem war¬ 
ranted, the examiner must be careful to explain only unfamiliar terms, 
and to leave the interpretation of others to the examinee, as therein lies 
part of the significance of the test. It is not so much the facts which 
matter in a personality inventory designed for the normal range of per¬ 
sonalities, as the subject’s attitude toward those facts; to make this con¬ 
crete, it is not the actual number of times he has fainted that matters, as 
his feeling that he is or is not given to fainting. 

Scoring stencils are provided for neuroticism (B1-N), self-sufficiency
(B2-S), introversion (B3-I), dominance (B4-D), self-confidence (F1-C), and
solitariness (F2-S), with weights ranging from 7 to -7 assigned to each
item according to its diagnostic value. These weights were determined by
relationship to the parent inventories. Various brief scoring methods
have been devised (18).
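
The arithmetic of weighted stencil scoring can be illustrated with a
minimal sketch in Python. The item numbers and weights below are
hypothetical, invented only to show how a scale score is built up as a
sum of response weights lying between -7 and 7; they are not the
published Bernreuter weights, and the function name is an assumption.

```python
# Hypothetical key for one scale: item number -> {response: weight}.
# These values are invented for illustration, not taken from the manual.
HYPOTHETICAL_KEY = {
    1: {"yes": 5, "no": -4, "?": 0},
    2: {"yes": -3, "no": 2, "?": 1},
    3: {"yes": 7, "no": -7, "?": 0},
}


def scale_score(answers, key):
    """Sum the weights of the responses given; unkeyed items add 0."""
    total = 0
    for item, response in answers.items():
        weight = key.get(item, {}).get(response, 0)
        # Guard against a mistyped key: weights must lie in -7..7.
        if not -7 <= weight <= 7:
            raise ValueError(f"weight out of range for item {item}")
        total += weight
    return total


example_answers = {1: "yes", 2: "no", 3: "?"}
example_total = scale_score(example_answers, HYPOTHETICAL_KEY)
```

In practice one such key would be needed for each of the six scales, the
same answer sheet being scored once per key.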

Norms. These are provided on high school, college, and adult popula¬ 
tions, gradations which are sufficiently refined as shown by studies of age 
differences. The adequacy of the norms has been shown by several in¬ 
vestigators (576,587,742,761) although some working with special popula¬ 
tions have disagreed (948). 

Standardization and Initial Validation. Many of the items in the
Personality Inventory were taken from the earlier blanks on which it
was patterned; criterion groups selected on the basis of high and low
scores on these other forms were then tested with the Bernreuter, and
weights were assigned accordingly. The correlations of Bernreuter’s
scales with the originals ranged from .67 to .qp, as might be expected
in view of the method of development. This proved little concerning the
validity of the inventory, as it depended upon the validity of the
not-well-validated parent forms; but it did demonstrate what Bernreuter
set out to prove, that one personality inventory could do the work of
four. It remained for subsequent studies, which Bernreuter himself failed
to make, to establish the validity or invalidity of the instrument by the
use of external criteria.

Reliability. The reliability studies have been numerous and are
summarized elsewhere (794); it need only be stated here that the
coefficients have generally been found to be above .70 and often above
.80, except after the lapse of substantial periods of time. Whether the
changes in scores which take place with time are the result of defects in
the inventory or of changes in the subjects is not known.

Validity. The validity of a personality inventory for use in vocational 
guidance and personnel selection or evaluation must be considered from 
two points of view: first, its value in screening maladjusted individuals 
who need psychotherapy or who should be rejected as employment appli¬ 
cants; and, second, its usefulness in predicting success and satisfaction in 
training and in various types of work. Basic to this second purpose is 
another purpose, that of measuring identifiable traits which may be
related to success and satisfaction. Such inventories have also a third
possible purpose, namely to assist in diagnosing the nature of a
maladjustment, but that is one which concerns psychotherapists, rather
than vocational counselors
and personnel men. The material discussed below has been selected and
discussed with these distinctions in mind. 

The items in the Bernreuter were chosen on a priori grounds, on the
basis, that is, of their diagnostic significance as seen in clinical
experience. They were validated by internal consistency, and named on the
basis of an examination of their nature; thus one of his scales seemed to
Bernreuter to measure autistic thinking, introspection, and other types
of behavior warranting the name introversion (87). This procedure was
criticized by Landis (152) as unsound because not empirical; he and Katz
found that, although three-fourths of the self-descriptive responses of
psychiatrically diagnosed neurotics agreed with objectively determined
facts (451), some items are answered contrary to expectation (452). More
normals than abnormals in their sample reported daydreaming tendencies,
ideas running through their heads, etc. The empirical approach
was recommended, with items weighted on the basis of group differences
(as in Strong’s Blank) rather than a priori grounds.

But Landis and Katz failed to take into account the important fact 
that Bernreuter had empirical evidence to justify his item weights, in 
the form of internal consistency data. They therefore made no attempt 
to rationalize their findings with his, although both must be accepted. 
This can be done by referring to the nature of the populations worked
with: Bernreuter’s groups were college students, high- and low-scoring
normals, while Landis’ and Katz’ were normals on the one hand and
abnormals (neurotics and psychotics) on the other. In other words, two
different types of “abnormals” were used, one in touch with reality, the
other somewhat out of touch. It is to be expected that the responses of
these two groups would differ, for their degrees of contact with reality
and their defense mechanisms are by no means the same. Poorly adjusted
normals may admit daydreaming more than well-adjusted normals, even
though abnormal persons admit such behavior less often, since they may
not recognize it as daydreaming. Two different sets of scoring keys may
therefore be needed, one for more or less normal persons, and another for
more seriously disturbed persons. Bernreuter’s scales were developed for
use with normal subjects.

The screening of maladjusted persons with this inventory has been
studied by a number of authors, whose findings for psychotics and
neurotics have been summarized as follows: “When the data are examined in
detail, they do appear to reveal differences between normal and various
groups of abnormal individuals, even though these differences are not so
clear-cut as one would wish . . . unfavorable scores do tend to have
significance, although favorable scores are not necessarily a sign of good
adjustment” (794:100). Since the above summary two other studies have
been published with military subjects. Schmidt and Billingslea (('>77)
found that although only the social dominance scale clearly differentiated
729 psychopathic and neurotic from 97 normal soldiers, the patterning
of scores on the Bernreuter scales was 80 percent effective in
differentiating them. Page (782) found highly significant differences in
the mean neuroticism scores of large groups of medically diagnosed
psychoneurotic and normal soldiers at Camp Lee. These findings are in
accord with a general tendency for personality inventories to be more
valid in wartime military situations than in civilian life, a phenomenon
which needs more study but which may be due to the fact that
maladjustment in the armed forces is in a sense rewarded by escape from
danger, and adjustment is in the same sense punished by the threat of
death, whereas in civilian life the rewards go to the well-adjusted.

Certain other types of problem groups have sometimes not been so
well differentiated by the Bernreuter: unmarried mothers did not differ
from controls (770), and problem children at Mooseheart made scores
comparable to those of others (732). But prison inmates have been found
more neurotic than normals (781,172), Hargan’s (771) contrary findings
suggesting that traits may differ with types of crime. Students coming to
a college clinic for psychological help have been found more neurotic
than others (761,664); college cheaters were more neurotic and dependent
than others (132,163); and the unhappily married were found to be more
neurotic than the happily married (407,85).

The recognition of potential leaders by socially desirable scores on the
Bernreuter has been shown to be possible in a number of studies.
Students who earn part of their expenses in college have been found more
self-sufficient and dominant than others (664,101), fraternity members
more stable, dependent, and dominant (101). Campus leaders have generally
been found more dominant, self-sufficient, and stable than other
students (664,394,631).

Ratings have been related to Bernreuter scores in a number of studies,
particularly with college students as subjects. A review of early studies
(794:109) shows that these generally agree moderately well, the modal r
being about .30. Two more-recent studies (94 j) found no relationship,
however, suggesting that little weight can be given to validity studies
based on ratings. Both self-ratings and the ratings of others presumably
have validity of a type, for one represents the subject-as-seen-by-himself,
the other the subject-as-seen-by-others; even though the two images may
not resemble each other, they are both important in the clinical study of
an individual.

Objective tests of intelligence have generally been found unrelated to
Bernreuter scores (791*106), but stability, extroversion, and
self-sufficiency have been found related to persistence test scores (661),
and introversion to Rorschach-tested introversiveness to the extent of
.78, affective stability to emotional stability .52 (892), findings
partially confirmed in factor analysis of the two tests ( 17 f>).

Grades have been used as a criterion with which to correlate Bernreuter
scores in a number of studies elsewhere summarized (794:109), the
general trend being for the relationships to be practically nonexistent.
In only one of the eight studies published prior to 19 p was any
relationship found; in it, Neel and Mathews (565) reported that
high-achieving students of superior mental ability were more introverted,
self-sufficient, and solitary than low-achieving students of the same
mental level. The more refined approach to the problem used in this study
seems to justify Stagner’s statement (712) that personality affects
scholastic achievement by influencing the use made of one’s abilities
and, therefore, does not yield a linear correlation with achievement.
More recent investigations have been published by Bennett and Gordon
(71), and by Sartain (670), with nursing students as subjects, by Bryan
(123) with art school students, and by Zelman (953) with general college
students. The first investigation found little relationship, as
determined by critical ratios, between nursing supervisors’ ratings and
Bernreuter scores, but Sartain reported correlations between grades and
self-sufficiency, and between grades and social dominance, of .29 and
.26. He discarded these as being of little value, and with only 81 cases
the correlations are not reliable, but they do suggest that the inventory
might contribute something unique to predictions normally based only
upon intelligence and achievement. The other two studies showed
insignificant relationships, confirming the common findings when
intellectually heterogeneous groups are used. It seems clear that, if
personality inventories are to be used in educational guidance, it should
only be for the study of special groups such as underachievers.
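
The imprecision of correlations based on only 81 cases can be made
concrete with a confidence interval. The sketch below uses the Fisher z
transformation, a standard technique introduced here for illustration
and not a method reported in the studies cited; the function name and
the 95 percent level are assumptions.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation r from n cases.

    Uses the Fisher z transformation: atanh(r) is treated as roughly
    normal with standard error 1 / sqrt(n - 3).
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# The two correlations reported for 81 nursing students.
low_29, high_29 = fisher_ci(0.29, 81)
low_26, high_26 = fisher_ci(0.26, 81)
```

Under these assumptions the interval for .29 runs from roughly .08 to
.48, and that for .26 from roughly .04 to .45, so the true relationships
could be anywhere from negligible to moderate.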

Success on the job has been correlated with Bernreuter scores in
relatively few studies, most of them fairly recent. Thirty foremen and
assistant foremen were tested with the Bernreuter, Bennett Mechanical
Comprehension, and Strong tests by Schultz and Barnabas (682), their
criterion being combined ratings of budget control efficiency and
employee relations. The combined scores of the tests had a validity of .52;
the Bernreuter validity was .‘}fi> (only the one unspecified scale was used).
A somewhat similar group of jo foremen in an aircraft factory was tested
by Sartain (fijt), again with ratings as the criterion. These had an
interform reliability of .79, the validity of the predictors ranging from
.01 (self-confidence) to .12 (social dominance), all of them too low to be
reliable. Similar data for a group of 85 foremen yielded no better results;
when 59 other foremen were classified as “good” or “poor” the difference
in Bernreuter scores was not significant. Empirical keys for pilots
developed in a study of aviation cadets initiated by the writer
(316:588-589) had no validity for success in pilot training.

Retail grocers, 70 in number, were rated according to credit and 
pecuniary strength on the basis of Dun and Bradstreet data by Hampton 
(327); these criteria yielded no correlations with Bernreuter scores
exceeding .16. In personal contact saleswork as exemplified by casualty
insurance salesmen, however, the relationships were higher: Bills and
Ward (92) and Schultz (683) found that successful salesmen made more
normal scores than did failing salesmen. Personality traits as measured
by the Bernreuter therefore seem, like interests, to affect vocational
success when the congeniality of the work is of especial importance.

Practice teaching has frequently been selected as an activity in which 
success might be expected to be related to personality traits as measured
by the Bernreuter. Cahoon (130), Sandiford (665), Laycock (455), and
Ward and Kirk (908) found no relationships using correlation techniques
or critical ratios, but when Laycock compared the top- and
bottom-quartile success groups there were great differences on all Bernreuter
scales. This finding was confirmed for another group by Palmer (585), 
and Pintner (605) found that good student-examiners (Stanford-Binet) 
were more stable according to the Bernreuter than were poor individual 
testers. 

In the one study of teachers in regular job situations, Gotham (302) 
failed to find any relationship between Bernreuter scores and teacher 
success, but the criterion was so unique as to need further study itself.
The subjects were teachers in 72 rural schools, their success being judged
by “pupil gains” or the improved performance of their pupils. In view
of the many variables affecting learning, and the varying situations in
which the teachers worked, the significance of pupil gains needs more
detailed scrutiny than can be given to it here. 

A group of bank clerks were tested by McMurry (199), who found
slight negative correlations (-.27 to -.05 for three different groups)
between neurotic tendencies and efficiency ratings, these scores adding 
so little to the predictive value of the Otis that the relationship seemed 
unimportant. 

It may perhaps be concluded, from the above studies, that personality
traits as measured by the Bernreuter are not generally related to success 
on the job, except in activities such as outside sales work in which the 
congeniality of the activity has a very direct effect on the degree of the 
worker’s application. 

Success in obtaining employment or in retaining a job was not related
to Bernreuter scores in the studies of the Minnesota Employment
Stabilization Research Institute (587), but Morton (545), Christensen
(iyh), and Lazarsfeld and Gaudet (457) have with both adults and
adolescents found differences between employed and unemployed,
job-getters and the unplaced, which were significant. The employed tend
to be more stable, more self-sufficient, and more dominant according to
the Bernreuter.

Occupational differences in scores on this inventory were first studied 
in the Minnesota Employment Stabilization Research Institute, where 
social dominance tended to distinguish salespeople from workers in 
skilled, semiskilled, and unskilled occupations, and policemen tended to 
be more dominant, stable, and extroverted than others, but other
expected differences were not found in a cross-section of employed workers.
These trends were in general confirmed by Dodge (201,202) in New York
City and Morton (545) in Montreal, both adding a few occupational
differences: salespeople were somewhat more dominant than clerical
workers, traveling salesmen than bookkeepers (201,202); accountants
and salesmen were most dominant, self-sufficient, and stable, engineers
and unskilled workers least so, while professional men and executives
tended to be dominant, carpenters and electricians tended to be
emotionally stable (545). Motion-picture writers studied by Metfessel
(52(1) did not differ in individual traits from the general population,
but examination of their average profiles showed that the patterning of
their scores is not typical. Johnson (404) also found 150 salesmen to be
dominant on the Bernreuter when compared to the norm groups; they were
a homogeneous group in this respect, unlike an equally large group of
seminary students; McCarthy (J92) also studied seminarians, finding that
they tended to be somewhat unstable and self-conscious when compared
to the general population. But in all of these instances the overlapping
of groups was so great as to make application of the findings impractical.

Stability in an occupation and job satisfaction should have been common
subjects of study by means of personality inventories such as the
Bernreuter, as it is commonly assumed and case studies have shown (277)
that personal maladjustment often underlies vocational dissatisfaction
and frequent job changes. Only one such study has been located with this
inventory, however; in it, Seagoe found no significant differences
between teachers who stayed in that occupation and those who left it,
although there was a tendency for the well adjusted, and for the
maladjusted of lower intelligence, to remain in teaching, and for the
maladjusted of superior mental ability to give up teaching, as though they
had the insight and the ability to leave an uncomfortable situation. More
studies of this type seem desirable, to throw more light on the dynamics
of vocational adjustment.

Use of the Bernreuter Personality Inventory in Counseling and Selection.
There is some danger, in a summary of this sort, that the discussions
of group differences which justify statements such as “salesmen tend to
be more dominant than clerical workers” will leave in the mind of the
reader the impression that a person making a high dominance score on
the Bernreuter might be a salesman, and that conversely a person making
a low score would do well to avoid sales work. It is well to remind the
reader that the existence of group trends is compatible with the finding
of many individual exceptions: some salesmen are not dominant, and
some dominant persons would not make successful salesmen. When it is
remembered that social dominance is just one characteristic often found
in good salesmen (in life insurance they tend also to be over 35, married,
fathers, to have bank accounts, and to carry insurance themselves) the
reason is obvious. It is well to remember this in reading the following
summary.

B1-N, the emotional instability or neuroticism scale, appears to measure
emotional sensitivity. Low scores tend to indicate a wholesome
extroversion, an ability to face facts and the environment objectively and
to deal with them without internal conflict, whereas high scores suggest
unwholesome introversion, poor adjustment to the environment and a
tendency to withdraw from it. A great variety of maladjusted people
make high scores on this scale: neurotics, autistic schizophrenics, and
depressed persons. Low scores are made by emotionally stable people,
and by those who in different situations are aggressive rather than
withdrawing, and also by leaders, fraternity members, the happily married,
the paranoid, manic individuals, and hyperthyroids. It has some
occupational significance, as shown in the tendency of the employed to be
more stable than the unemployed (this could be either cause or effect), the
superior stability of policemen, accountants, salesmen, carpenters, and
electricians, and the tendency of emotionally stable teachers to stay in
their field while the able-unstable changed to other occupations.

B2-S, the self-sufficiency scale, probably measures another type of
introversion. The high-scoring person tends to be self-sufficient, does not
depend on others for advice and emotional support; he is not withdrawn so
much as free from the necessity to advance, an introvert in the Jungian
sense of the term. The low-scoring person is probably not an extrovert,
however, in the usual sense, for this implies a wholesome turning to the
environment whereas in such instances the turning outward is the result
of a need to depend upon the environment for emotional support normally
found within the self. Low scores therefore probably represent an
unhealthy sort of extroversion, contrasted with the wholesome extroversion
measured by B1-N. Maladjusted groups which tend to make high
self-sufficiency scores include neurotics (the false self-sufficiency of
compensatory fantasy?), withdrawing persons (for the same reason?), and
divorcees; those making low scores include cribbers and epileptics. The
occupational significance of this scale is indicated by the high scores made
by leaders and contact workers, and the low scores made by those who
work primarily with records or materials.

B3-I has been found to resemble B1-N to such a high degree (794:110)
as to justify not using it. The introversion-extroversion which it was
designed to measure has, we have just seen, already been provided for.

B4-D, the dominance-submissiveness scale, measures the tendency to 
dominate in face-to-face situations. It is apparently not a pure trait, but 
a combination of wholesome extroversion and sociability (794:110). Low
scores indicate submissiveness, but high scores may be indications of the
conviction that one should seem dominant rather than of a tendency to
take the initiative in social situations. Problem individuals who tend to
make high scores seem to include only those who react aggressively to
difficult situations (794:116), if these may indeed be called problem
people; low scores tend to be made by withdrawing persons and by others
who have difficulty coping with the environment (794:116). The occupa¬ 
tional significance of the scale is shown by the superior dominance of the 
employed, salespeople, policemen, accountants, professional men, and 
executives, and the submissiveness of the unemployed, unskilled and 
semiskilled workers, clerical workers, and bookkeepers. 

The F1-C and F2-S scales, developed by Flanagan on the basis of factor
analysis (261), have been respectively shown to have much the same
significance as B1-N and B2-S.

In schools and colleges the Bernreuter can be used with a fair degree of
confidence as a measure of group trends, and for the screening of problem
individuals who are to be studied by more intensive methods. It is likely
to prove more helpful in survey testing than as part of a battery for
intensive study of an individual. “Bad” scores, which are sometimes high
and sometimes low, can generally be assumed to have some significance,
but “good” scores may be compensatory rather than the result of a 
wholesome adjustment. It should be more useful in educational institu¬ 
tions than in clinics or employment situations, because other methods of 
study suitable for clinical use should prove more penetrating in mental 
hygiene work, and because the desire to make a good impression can dis¬ 
tort scores when applying for employment. The item-validity is such as to 
make the inventory best for use with normals and near-normals, rather 
than with psychotics. The inventory is of questionable value in detecting
behavior-problem cases, as opposed to otherwise emotionally maladjusted 
persons. The use of the Bernreuter scores in counseling concerning voca¬ 
tional choice appears to be virtually limited to consideration of the sig¬ 
nificance of dominance scores for business contact occupations. When 
these are high, confirmation should be sought in extracurricular and 
leisure-time activities; when low, case history and cumulative record
material should also be examined in order to ascertain how the trait has
affected social behavior, for some successful salesmen are not exactly
dominant individuals, although these men may perhaps expend more
energy in meeting the social demands of sales work than do more dominant
persons. In any case in which abnormally high or low scores are
made, the counselor should study cumulative record and interview data
in order to understand the significance of the score for the person in
question, and if it indicates that the counselee may have difficulty making
adjustments which he is likely to be called upon to make, the counselor
should make an attempt to get him the needed therapeutic help.

In guidance centers the use of this inventory may be similar to that in 
schools, especially if used routinely for survey testing and screening. With 
clients who come because they themselves feel the need for help it may 
provide one more kind of data concerning personality traits, to be viewed
in relation to other data, but it is not likely to help with those who come
because they are sent or who are referred for appraisal as possible
employees. It has proved of some value in selecting salesmen, and so may
have a place in personnel evaluation, whether because it portrays the
applicant’s actual personality or because it shows how well he knows what a
good salesman should be like and do; by and large, however, other methods
of personality study should be relied upon when referral or
recommendation for employment is under consideration.

In employment services and in business and industry inventories such
as this are not likely to prove satisfactory, because of their
transparency, except as pointed out in the preceding paragraph. The
occupational differences which have been observed with it were all
detected, it should be remembered, in situations in which the examinee had
little or nothing at stake.

The Minnesota Multiphasic Personality Inventory (University of Minnesota
Press, 1943: Psychological Corporation, lc^f,)

This personality inventory was developed by Hathaway and McKinley
at the University of Minnesota as a clinical instrument for use in
psychiatric diagnosis (353). It was not intended as a test for use in
educational and vocational counseling, or in personnel selection. Their
purpose was to develop one personality inventory which would measure all
aspects of personality which bear on psychiatric diagnosis, thus
implementing Rosanoff’s theory of temperament (645). They wished to make
more objective the judgments that are reached in a clinical situation by
providing more systematic coverage of behavior and attitude items than is
generally possible in an interview. But there is evidence in many guidance
centers of an interest in applying this instrument to vocational guidance
and selection, apparently in the belief that since it is considered a better
clinical inventory than most others on the market, it should also be a
better vocational test. This is a non sequitur, but it makes desirable some
consideration of the test in this chapter. No attempt will be made to go
into its clinical validity in any detail, as that is a long story the tangential
relevance of which makes a mere summary suffice; what is known of its
vocational significance will be discussed at somewhat greater length.

Applicability. The Multiphasic was designed for use in mental hygiene
and psychiatric clinics, with older adolescents and adults who have had
a few years or more of education. It has been administered to junior high
school boys and girls, but according to the authors 101*0) it has not
been validated at that age, for which many items might conceivably have
quite different significance. The authors report that several of the traits
measured change within relatively short periods of time, as one would
expect of depression and hypomania, which are attitudinal manifestations
of one underlying type of temperament. Some of the other traits might be
expected to be less subject to the effects of experience and of mood, as in
the case of masculinity and psychopathic deviation. Studies of the extent
and nature of such fluctuations have apparently not been published.

Content. The MMPI consists of 550 self-descriptive items such as are
found in the Bernreuter and in other personality inventories on which it
was based. They are classified under 26 categories, ranging from general
health through the gastrointestinal system, habits, family, occupation, sex,
phobias, and morale to items designed to show whether the examinee is
trying to describe himself in improbably good terms. Some items such as
the first two listed below are quite innocuous, while others, like the last
four, are more likely to seem offensive:

I like to read newspaper editorials. 

I hate to have to rush when working.

Someone has it in for me. 

Peculiar odors come to me at times. 

At times I feel like smashing things. 

There is something wrong with my mind. 


Administration and Scoring. There is no time limit, but testing normally
takes from 30 to 90 minutes, depending on the education and adjustment
of the examinee. There are two forms of the test, one consisting
of a set of cards administered individually and to be sorted into three
stacks (True, False, Cannot Say), the other a booklet with IBM answer
sheets. The test authors recommend the individual form, and Ellis
(238:423) has suggested that it may be superior to the booklet form, but
Wiener’s study of 200 veterans in a guidance center found no differences
in group trends on the two forms (924).

The card test is recorded on special forms, and both are scored by means
of stencils, or the booklet can be machine scored. Scoring may now be done
for nine reaction patterns: hypochondriasis, depression, hysteria,
psychopathic deviation, masculinity-femininity, paranoia, psychasthenia,
schizophrenia, and hypomania. Others may be added. Four other scores
(question, lie, validity, and a “suppressor variable”) are also available to
aid in judging the meaning of the scores. It should be noted that although
at least one of the traits may be thought of as one aspect of temperament
(masculinity-femininity), two others seem to be mood-manifestations of
another aspect of temperament (hypomania-depression), and still another
may be the pathological extreme of a personality trait (schizophrenia),
the others are traits made up of modes of behavior which are not normally
considered as components of the normal personality, but are generally
thought of as clinical syndromes or even disease entities. On logical
grounds one might therefore question the soundness of applying such
measures to normally adjusted persons and drawing conclusions concerning
occupational differences, but to do so is consistent at least with
Rosanoff’s theory of temperament ((145). This theory postulates three
components, of which the above psychotic tendencies are developments
and on which this inventory is based.

Norms. The standardization group consisted of about 700 men and
women representative of the general Minnesota population in age and
education, and not under medical care; norms are based on hospitalized
patients from each of the nine diagnostic categories, averaging about 50
in number (manual). The development of norms for psychiatric classifications
is difficult because of the impurity of cases in actual practice, and
the consequent difficulty of classification in any one category; clinical
users of a test such as this should examine published data on the norm
groups more carefully than is appropriate here. No occupational norms
have been published, but data are given for five small groups of workers
in as many occupations in two published studies (469,893) discussed
below.

Standardization and Initial Validation. In attempting to develop a
measure of Rosanoff’s temperament components Hathaway and McKinley
relied partly on the only existing inventory which had the same purpose,
the Humm-Wadsworth Temperament Scale. The data concerning this
scale have been so uniformly favorable when published by the scale’s
authors or persons working under their auspices (365,387,388) and so
often unfavorable when analyzed by others (21*8,310,902), and its handling
has been a matter of such frequent criticism, that ethical and informed
psychologists are reluctant to use it despite some good and unique
features. They also drew from the Bernreuter and the Bell, which were
used in their first investigation (353), and made up other items of their
own on the basis of psychiatric manuals and clinical experience. Items
were assigned to scales on the basis of the extent to which they differentiated
221 classified psychiatric patients from 723 normal persons bringing
relatives or friends to the University of Minnesota Hospital, 265
college-entrance applicants at the University, and other similar persons
presumed to be normal. The first clinical group consisted of 50 carefully
screened hypochondriacs (496), a cross-validation group of 25 hypochondriacs,
and control groups of 699 normals, 50 normals with physical
disease, and 35 miscellaneous psychiatric cases. The hypochondriacs were
significantly (C.R. = 10.9) distinguished from the normals; the other
non-normal groups were also, but the overlapping in their cases was much
greater (C.R. = .po and 2.5). The other clinical groups were equally small:
the depressed also number 50 (35.]). But the tendency to distinguish
appropriate groups from others was in each case cross-validated and stood
the test. The various scales of the Multiphasic may therefore be said to
have been empirically developed and validated against appropriate
external criteria.
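
The critical ratio (C.R.) quoted above is the difference between two group means divided by the standard error of that difference. The following is a minimal sketch of the computation, not the test authors' own procedure; the function name and all figures are hypothetical and are not taken from the studies cited.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two group means divided by its standard error.

    Assumes independent groups and the usual large-sample formula
    SE = sqrt(sd_a^2 / n_a + sd_b^2 / n_b).
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Hypothetical scale means, standard deviations, and group sizes.
cr = critical_ratio(12.0, 2.0, 8, 10.0, 2.0, 8)
print(round(cr, 2))
```

A ratio of about 3 or more was conventionally read as a dependable difference, which is why a value such as 10.9 indicates very little overlap between groups while a value near 2.5 indicates a good deal.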

Reliability. The test authors believe that the nature of the Multiphasic 
precludes the possibility of adequate indices of reliability (352:1020), be¬ 
cause of the variations of some of the traits from time to time within a 
given person, and because of the heterogeneity of the items which make 
up clinical syndromes in contrast with pure traits. They have reported,
however, that the test-retest reliabilities range from .71 to .83; this is
about as high as those of most personality inventories. An empirical check
on the authors’ hypotheses concerning variations in scores would be desirable;
it would be possible, for example, to rate clinically studied individuals
on these traits, and to relate changes in rated condition to changes in
inventoried condition, thus ascertaining whether the somewhat lower
than desirable reliabilities are due to variations in the individual rather
than to the unreliability of the instrument. Although such ratings are 
themselves not very reliable, if made each time by the same person they 
would presumably have a sufficiently high degree of reliability as indices 
of increase or decrease in the type of behavior under study. 

Validity. The clinical validity of the Multiphasic was reviewed by
Ellis (238) in 1946, by which time 13 clinical validation studies had been
published. Ellis’ approach to personality inventories was hypercritical:
he defined r’s of 0 to .19 as negative, .20 to .39 as mainly negative, .40 to
.69 as questionably positive, .70 to .79 as mainly positive, and .80 up as
positive, although he claimed on page 393 to “evaluate the reported
coefficients of correlation in terms of the conventional estimations,” a
claim subsequently modified. Nevertheless, he found that eight of these
investigations yielded positive results, while three showed some validity
and only two failed to demonstrate validity in this inventory (the writer’s
summation from data on pages J20 to 922). These figures were much
better than those for the other inventories summarized by Ellis, the next
best being only nine confirmations of the Bernreuter’s validity, while six
studies showed some validity and 19 showed none, according to Ellis’
severe and unconventional criteria. These data suggest that the Minnesota
Multiphasic has more validity for screening and classifying personality
problems than any of the generally available personality inventories.
Findings of some of the specific studies are discussed below, but no
attempt is made to review them all in detail as clinical diagnosis is not
the central interest of this chapter.
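
The severity of Ellis' cut-offs is easier to see when they are written out as a rule. The sketch below simply encodes the five bands given above; the function name is hypothetical, and the restriction to coefficients between 0 and 1 is an assumption, since the passage does not say how negative coefficients were handled.

```python
def ellis_category(r):
    """Classify a validity coefficient by the bands attributed to Ellis.

    Assumption: only coefficients from 0 to 1 are classified; anything
    else raises ValueError rather than guessing at Ellis' intent.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("coefficient must lie between 0 and 1")
    if r < 0.20:
        return "negative"
    if r < 0.40:
        return "mainly negative"
    if r < 0.70:
        return "questionably positive"
    if r < 0.80:
        return "mainly positive"
    return "positive"


# A coefficient of .45, respectable for a personality inventory,
# is rated only "questionably positive" under this scheme.
print(ellis_category(0.45))
```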

In the development of the scales Hathaway and McKinley found that,
despite overlapping of populations, from 50 to 80 percent of each of their
psychiatrically diagnosed groups were differentiated from normal persons
and generally even from each other by the scales for hysteria, hypomania,
psychopathic deviation (355), hypochondriasis (496), psychasthenia (197),
and depression (35.}). Although their groups were small, ranging from
about 25 to 50 per category, the clinical diagnoses were carefully made 
and the trends were very suggestive. They were confirmed by most sub¬ 
sequent studies, as brought out below. 

The inventory was administered to 85 naval psychiatric patients by 
Benton (73), who found that five out of ten schizophrenics were differ¬ 
entiated by the appropriate scale, as were five out of nine hystericals, 13 
of 16 delinquents (psychopathic deviation), and nine of ten homosexuals 
(femininity). In another study he and Probst (74) tested 70 persons diagnosed
by Navy psychiatrists; in this the psychopathic deviate, paranoid,
and schizophrenic scales showed statistically significant differences be¬ 
tween clinical groups and normals, although the differences for the 
other scales were not clearly significant. 

Delinquent adolescent girls were compared with nondelinquent controls
by Capwell (137), who found that the former were clearly differentiated
by all but the hysterical scale, the psychopathic deviate scale being
the most diagnostic. As Van Vorst (891) reported negative results with a
group of psychopathic delinquents, the subject needs further investigation.

Other psychiatric groups were studied by Gough (30.1), who found
differences between the scores of normals and 136 neuropsychiatric
soldiers classified according to severity of neurosis, or as psychopathic
deviates and psychotics; Harris and Christiansen (313), who tested 53
psychiatrically diagnosed patients and found perfect agreement in more
than half the cases, and complete disagreement in about 10 percent of
the cases; Leverenz (]f>8), who used the test in an Army hospital and found
it of “definite value” despite some disagreement with clinical diagnoses;
Michael and Buhler (527), who tested 90 psychiatric patients in a general
hospital and found it successful in only about .j5 percent of their cases
and of little value in differentiating psychopaths from psychotics; and
Schmidt (byb), who found statistically significant differences between
normal soldiers and those diagnosed as constitutional psychopaths,
neurotics, and psychotics.

Ellis suggests three possible explanations of the positive results generally
obtained with the Minnesota Multiphasic as opposed to the more
commonly negative results reported in clinical studies of other inventories
(238:423):

1. Individual administration may bring about, at least in part, the same kind
of rapport factors which are so important in case study interviews.

2. Most individual administrations have been done with one test, the Minnesota
Multiphasic, which was standardized on a decidedly clinical and objective,
rather than the more usual subjective (internal consistency), basis and which,
in consequence, may possibly be a superior questionnaire.

3. The majority of Multiphasic validity studies have either been done on
groups (similar to those) used to standardize the test, which have been
institutionalized populations which may be more sophisticated and more honest
than other abnormal groups; or else they have been done with military personnel,
who may have every incentive to answer personality questionnaires honestly.

Obviously, further investigations of these hypotheses are needed, and
will probably be reported in the literature in due course. Wiener (924)
has already shown that, at least with veterans coming to a guidance
center, the means of those taking one form are the same as those of the
men taking the other form. If this is verified by other approaches to the
problem of form, in which the group form is given to groups and the
individual form to individuals (a procedure apparently not followed by
Wiener, who used the two forms under identical conditions), Ellis’ first
hypothesis must be discarded.

Achievement criteria of various types are of principal interest to vocational
guidance and personnel workers, who need to know not only the
effectiveness of the test in screening maladjusted persons who may need
special attention, but also the significance, if any, of the trends measured
for educational and vocational success. No studies of the educational
predictive value of the Multiphasic have been noted in the literature,
but one paper advocating the use of the inventory in vocational counseling,
one study of the relationship between Multiphasic items and occupational
success, and four on occupational differences revealed by the test,
have been located. These are discussed below.

The “accumulated experience” of two veterans’ counselors who used
this inventory in vocational counseling was described early in 1945 by
Harmon and Wiener (tlO-). As the test was less than two years old at
the time at which they wrote, work with veterans was only getting under
way, and their data were not quantitatively treated, their statements
should probably be viewed as hypotheses to be investigated rather than
as findings to be applied in practical work. Statements such as one to the
effect that the Multiphasic “has proved an instrument of prime utility”
which “has served to delineate personality characteristics of crucial
importance in the actual choice of a vocation and has yielded valuable
information to aid in prognosis of success in training” were made, it
should be noted, before the veterans in question had tested the choices 
made in actual work and before they had had any opportunity to achieve 
success or failure in training. The case studies presented are more con¬ 
vincing as evidence of the usefulness of the inventory in locating persons 
who need psychotherapy before they can function in any kind of work, 
than as evidence of its value in aiding in the choice of occupation or of 
type of training: in only one of the six cases did it really play a differ¬ 
ential role in vocational counseling. 

Success in flying training was the criterion used in a study reported by
Guilford (316:599-601). The group form was administered to 856 would-be
pilot cadets in 1944, and the items were validated after reports of
their success or failure in primary training became available. It was
decided to validate the published scales only if a sufficient number of 
items were valid to justify scale validation. The phi coefficients were 
unimodally distributed with a central tendency at zero, indicating that 
few if any items had any genuine validity for success in flying training. 
The clinical scales were therefore not correlated with the criterion. 
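
The phi coefficient used in this kind of item validation is the correlation between two dichotomies, here the response to an item (True or False) and the outcome of training (pass or fail). A minimal sketch follows; the function name, the cell labels, and the counts are hypothetical and are not Guilford's data.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table of item response against training outcome.

    a: answered True and passed     b: answered True and failed
    c: answered False and passed    d: answered False and failed
    Returns 0.0 when any marginal total is zero (phi is undefined there).
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0
    return (a * d - b * c) / denom


# Hypothetical counts for one item: passing and failing cadets answer
# True in nearly the same proportion, so phi is close to zero.
print(round(phi_coefficient(210, 70, 432, 144), 3))
```

Computing this for every item and plotting the results gives the distribution described above; a single hump centered at zero is what would be expected if no item had more than a chance relation to the criterion.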

Occupational or preoccupational differences were studied in two investigations
by Lough (485,486). In the first she found that 185 unmarried
women undergraduate students of education were a relatively stable 
group with a very slight tendency toward hypomania and that there were 
no significant differences between those preparing to be elementary or 
music teachers. In the second paper she reported findings for 300 un¬ 
married women undergraduates, including the original group and 115 
liberal arts college students. A slight tendency toward hypomania was 
found in the new group as in the original, suggesting that this might be 
characteristic of adolescents. There were no differences between cur¬ 
ricular groups, to which nurses and the various liberal arts majors were 
added. She concluded: “it is not a useful instrument for differentiating 
between those who are more suited for one occupation than another. 
The primary value of the MMPI seems to be to give some insight into
the emotional life of the individual and to detect those who may be in 
need of psychological or psychiatric counseling.” It should be noted that 
her first conclusion is based not on success but simply on choice (a some¬ 
what questionable criterion, as some who choose fail and some who might 
succeed do not choose), and that her second conclusion is based on the 
evidence of other studies, reviewed in earlier paragraphs of this section.
The writer is inclined to subscribe to her conclusions, but the first one 
at least needs further proof. 

Women clerical workers, department store saleswomen, and women 
optical workers were tested by Verniaud (893), the samples numbering 
40, 27, and 30 respectively. The workers came from several different 
offices and stores, and from several departments of one factory. The 
profiles of the two white-collar occupational groups differed very little 
from the norms, except for somewhat low hypochondriasis in the clerical 
workers and decidedly masculine scores in the salesclerks, but the optical 
workers were decidedly hypomanic and psychasthenic, and somewhat 
paranoid and psychopathically inclined.

In view of what is known of the interest patterns of women clerical
workers, it is not surprising to find them a normal group, resembling
women in general. The masculinity of department-store salesclerks is
surprising, as their work is a relatively passive type of sales and they deal
largely with feminine items; as Verniaud points out, this finding would 
bear further investigation. 

The findings concerning the factory women raise several questions, for 
they may be peculiar to the local situation (one company in one town), 
to the occupation (blocking, roughing, emery grinding, polishing, finishing
jobs), to the socio-occupational level, or to the population (e.g., a
minority group). There is no description of the status of the women but
the factory workers were all employed on war jobs (Navy contracts),
whereas the others were engaged in more normal, peacetime, operations 
rather than in war industries. This suggests that they may have been a 
quite atypical group of women workers: drifters, thrill seekers, and 
others who might flock to a boom industry on a temporary basis. 
Verniaud does not go into this possibility, but does state that “In terms 
of the expected meanings of the characteristics (MMPI scales), we would
expect these workers as a group to be restless, ‘full of plans,’ alternating
between enthusiasm and over-productivity in energy output and moods
of depression, more inclined toward anxieties and compulsive behavior
than the average individual, disinclined or unable to concentrate for
long periods on one task, somewhat oversensitive or suspicious of the 
good-will of others, somewhat more inclined than the average woman to 
disregard social mores.” Only three sample case studies are presented in 
this report of her master’s thesis, but Verniaud states that the test profiles 
are borne out by case-study material which she collected in the thesis.
Before any vocational guidance or selection applications are made of
such findings, it would be imperative to ascertain whether the factory 
workers whom she studied are in fact typical of women factory workers 
in general, this type of occupation only, only this plant at this time, or 
merely of women war-plant workers. This last type of group no longer 
has occupational significance but must, if accurately described by these 
findings, still be unhappy and making unhappiness for others. 

Life insurance salesmen and women social workers, 50 subjects in each 
group, were studied with the Multiphasic by Lewis (469), each group 
being compared with the norm group of the same sex. The insurance 
salesmen were found to be significantly more depressive, hysterical,
psychopathic, feminine, paranoid, and hypomanic; the last-named had the
highest T-score (58.1), while only femininity and hysteria reached a
T-score of 55. The social workers were significantly high on the depression
and hysteria scales, significantly low on those for masculinity,
hypochondriasis, psychasthenia, and schizophrenia, their psychopathological
sophistication being perhaps a contributing factor to their low scores;
for this reason pre-training tests, evaluated after employment in the field,
would have provided more convincing evidence of occupational differentiation
in this type of work. Lewis also found that those whose interests,
as measured by the Kuder Preference Record, were least appropriate
for their work, tended in each occupation to be the least well adjusted,
but the differences were not clearly significant in most comparisons.

Job satisfaction has not been studied by means of this inventory, 
although the findings just reviewed have implications for that topic if 
confirmed by other studies. 

Use of the Minnesota Multiphasic Personality Inventory in Counseling
and Selection. In a few years there will probably be enough accumulated
evidence concerning the traits measured by the Minnesota Multiphasic
to justify a discussion of their significance paralleling that for the
Bernreuter, but all that can be written at this stage of its development
would concern their clinical significance rather than their vocational
implications. Such material has a very important place in a manual of
clinical psychometrics, but not in a book designed for use in counseling
concerning vocational choice, selection, or upgrading. Not, that is, until
more is known about the vocational significance of clinical data. It is
enough for our purposes to state that the authors’ claim that persons who
make extreme scores on any of the scales probably need psychotherapy
seems valid, as high scores have generally been found to characterize
appropriate clinical groups. A high score may be defined as a T-score
exceeding 70.
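
For readers unfamiliar with the scale, a T-score is a standard score with a mean of 50 and a standard deviation of 10 in the norm group, so that a T-score of 70 lies two standard deviations above the norm mean. The sketch below assumes a simple linear conversion; the function names and the raw-score figures are hypothetical and are not taken from the test manual.

```python
def t_score(raw, norm_mean, norm_sd):
    """Linear T-score: 50 plus 10 times the standard (z) score."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


def is_high(t, cutoff=70.0):
    """True when the T-score exceeds the cut-off of 70 given above."""
    return t > cutoff


# Hypothetical norm group: mean raw score 20, standard deviation 5.
t = t_score(32, 20, 5)
print(t, is_high(t))
```

On this scale the occupational means reported above (none higher than 58.1) fall well within one standard deviation of the norm mean, far short of the cut-off for a high score.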

Occupations which may be appropriate or inappropriate for those who
make extreme scores on this inventory cannot as yet be listed, if indeed
they ever will be. We have seen that one investigator concluded that the
real value of the instrument is in clinical rather than in vocational diagnosis.
There are indications that hypomania, hysteria, and femininity
may be characteristics which make for success and satisfaction in selling
life insurance; that depressive and hysterical tendencies may be suggestive
of social work for women; and masculinity may make sales work a
suitable outlet (other things being equal) for women. Other possible
vocational implications of this inventory need confirmation with larger
and more representative groups whose background and working environments
must be carefully described for the data to be meaningful. In the
meantime Verniaud’s suggestion that the Multiphasic be used only as a
clinical instrument seems to this writer to be the only one justifiable at
this stage of its development.

In schools and colleges the Minnesota Multiphasic Personality Inventory
may therefore be useful as a device for screening students in need
of further study and perhaps of counseling in relation to personality
adjustment; more often, it is helpful as a diagnostic device following such
screening by other less elaborate inventories or after referral by other
staff members, to provide the counselor with some orientation to the
nature and extent of the maladjustment. It is not recommended as an
aid to vocational counseling except when the counselor is also a clinical
psychologist and the client is a maladjusted person in need of help with
an immediate problem of vocational choice or adjustment.

In guidance and employment centers the Multiphasic has more of a
place because the larger number of persons with personality problems
who come to such centers makes careful screening imperative. This
inventory may therefore be helpful in a secondary test battery when a
shorter routinely administered personality inventory, the psychometrist’s
or preliminary interviewer’s observations, or the referring source suggests
the presence of psychopathology. Positive findings would then be an
indication of need for therapy beyond the scope of the typical vocational
or placement counselor, or for co-operative work with a psychotherapist,
the vocational counselor helping the client to make a vocational adjustment
which contributes to his general adjustment by making one aspect
of his life that much more successful and satisfying. Differential occupational
prediction on the basis of Multiphasic scores, such as is suggested
and practiced by some counselors, is still premature except in a highly
tentative way and on the basis of confirmation by case-history material.
In evaluating persons being considered for referral for employment or
referred for evaluation by employers the inventory may have some value
as a screening or selective-placement device, but in view of what is known
about the faking of scores on other inventories the results in such cases
should be very critically viewed.

In business and industry this inventory may be helpful as a means of
screening out maladjusted employment applicants, as those who make
high scores are extremely likely to have personality problems; but
low-scorers may include many who are merely successful at disguising their
true characteristics. It may also be of use in personnel evaluation, either
for the selective placement of handicapped persons or for the improve¬ 
ment of supervisory and executive functioning. In this type of work the 
interpretation should be done only by a qualified clinical psychologist, 
as the results might otherwise be bad both for the individual and for the 
company, and referral facilities should be available if psychotherapy is 
indicated. The inventory may prove to have value in the selection of 
salesmen and other contact personnel, and perhaps with other types of 
employees, but local validation and normative studies must be carried 
out before such use is possible. 

The Bell Adjustment Inventory (Stanford University Press, 1934, 1937) 

This is another widely used personality inventory, particularly in 
schools and colleges, but it has not had particular appeal either for clini¬ 
cians or for industrial personnel workers. Published three years after the 
Bernreuter and scorable for four aspects of adjustment, somewhat different
in superficial ways from the Bernreuter and its predecessors, it somehow
escaped the more violent criticisms leveled at them and caught the
second crest of the wave of popularity of personality inventories. Perhaps
it seemed sufficiently different from others to be “worth trying” as the
search for an effective personality inventory continued. New types of
inventories such as the Minnesota Multiphasic Personality Inventory
had not been published as yet, the Humm-Wadsworth was criticized by
the Bernreuter’s critics, and projective techniques had not yet become
generally known. Bell’s monograph (62) gave users of his inventory the 
feeling that they knew something about the instrument, and the names 
of the traits it measured had a safe and homely sound quite different 
from the trait-names of the much criticized inventories. 

Description. The Bell Adjustment Inventory is published in two
forms, one for students and one for adults, and is scorable for four
aspects of student and five of adult adjustment: home, health, social,
emotional, and, in adults, occupational adjustment. It is designed for use
in high schools and colleges, and with adults. Although it has been suggested
that some of the items make it offensive to some people, Pallister
and Pierce (584) reported that they found it quite acceptable to the
Scottish subjects with whom they worked. The blank consists of about
100 questions like those in other inventories, although some of the questions
which are treated as health questions by Bell are weighted for
neuroticism in most inventories (e.g., “Do you have many headaches?”).
Responses are of the yes-no type, and are recorded on the blank or on
IBM sheets. It is self-administering, with no time limit, and requires no
more than 30 minutes. Scoring is quickly done by means of stencils, each
response being given a weight in one scale only, and the score is the
sum of the circled responses. Norms are provided for high school students,
college students, and adults; these have been criticized by Tyler
(880) because of the use of a five-point scale which gave undue weight to
changes of a few responses; Bell (63:995) considers the norms tentative
despite the lapse of more than ten years since publication, and recommends
the development of local norms. The reliability of the inventory
has generally been found satisfactory for group purposes but somewhat
low for individual diagnosis, Turney and Fee (884) reporting retest
reliabilities ranging from .74 to .85, and Traxler (857) odd-even reliabilities
of from .83 to .93.

Validity. In the development of the forms (62) items were used which
distinguished the high- from the low-scoring groups of students or adults
on whom they were standardized. Scores were correlated with those obtained
by existing inventories, and the coefficients ranged from .57 to
.89 for appropriate scales. Students and adults designated by counselors
who knew them as well or poorly adjusted in each area were found to be
distinguished quite significantly by appropriate scales.

The clinical validity of the Bell has been disappointing. Ellis’ summary
of published studies (238) reports 12 investigations of the inventory,
11 of which showed that it had little or no value for the identification
of maladjusted persons, and only one of which showed positive
results. Readers who wish to look into the details are referred to Ellis’
concise and well-organized, even though severe, summary. The writer has
located only one investigation missed by Ellis, and although it (938) is
favorable one such study cannot change the picture presented by studies
such as those by Marsh (511) and Feder and Mallett (252) in which the
inventory was found to have very little value in screening students in
need of psychotherapy.

Grades were correlated with Bell scores by Drought (213), Young et al.
(950), Clark and Smith (160), Crider (181), and Griffiths (313) with results
which were negative; they generally have been in such studies. Only
Fischer (256) has reported positive results with this inventory, using an
index different from those of the other studies. He constructed an
underachievement ratio based on scholastic aptitude and grades, which
correlated .42 with Bell’s emotional adjustment score, suggesting as in some
of the studies with other personality inventories that when intellectual
factors are taken into account, personality traits play an observable part.
The defect may therefore lie, not so much in the inventory, as in the
design of the studies. Fischer found a correlation between emotional
adjustment and point-hour ratio of -.32; as his cases numbered only ,jS
his findings are merely suggestive, but may be worth following up.

Success on the job has been correlated with Bell scores in only one
known study, in which Forlano and Kirkpatrick (i>(>8) tested 20 women
radio tube mounters. No data are given for the Bell alone, but only for
a combination of its social adjustment score with that of Washburne’s
inventory. Twenty cases are too few for the results to be conclusive, but
it is interesting that all eight employees rated “good” in efficiency by
their supervisors made average or better scores on the inventories, while
the 12 who were rated “fair” made average or below average adjustment
scores.

Occupational differences have not been studied in employed persons
by means of this inventory, but McCarthy (jcj2) administered it to
seminarians, and found them below average in total adjustment on the Bell
as on the Bernreuter. Whether this was a personality pattern which
existed before entrance into the priesthood, or merely a transitory
reflection of the experiences these men were undergoing in training, was not
brought out by the study.

Job satisfaction has not been studied with this inventory, although the
inclusion of a number of questions bearing on this in the adult form
might have been expected to be the result of or to encourage such studies.
Only Seagoe (GScj) has touched on this subject in her study of permanence
in teaching, in which, as we have seen, there was a slight tendency for
well-adjusted student-teachers to remain in the profession, together with
the less-intelligent maladjusted, while the brighter maladjusted tended
to leave for other types of employment; but these differences were not
statistically significant.

Use of the Bell Adjustment Inventory in Counseling and Selection.
Unlike other personality inventories, this instrument attempts to measure
not only traits (emotional adjustment or stability) but also degrees
of adjustment in several areas (home, social groups, and health). This
seems to have been done on the assumption that it would be helpful to
know which area is the most active source of maladjustment, which the
source of most security and satisfaction. The intercorrelations of the
several keys reported by Bell (hii), Tyler (Ntth) and others (rs = .oj to
.53) indicate that there is some overlapping of the scales, but they are low
enough to suggest the conclusion that several factors are being measured.
But the Bell total adjustment score has a correlation of .77 with
Bernreuter’s emotional stability scale (602), and Turney’s criticism that “It
must require considerable faith or temerity to believe that 140 items,
averaging 35 to a division of the kind in the scale (the four scales), really
measure adjustment in a satisfactory manner. The complexity of the
psychobiological environment must have been grossly overestimated by
a host of psychologists if we are mistaken about this” (884), seems to
contain a possible explanation of the low intercorrelations. It may be
that each set of 35 items is merely a sample of the items which would
make up an emotional stability scale, their low intercorrelations being
due to the inadequacy of their sampling of the various ways in which
emotional stability or neuroticism manifests itself. This has been
suggested also by Young, Drought, and Bergstresser (950). If this is so, the
study of loci of maladjustment may still be helpful, but the danger of
making the deduction that a certain individual is “well adjusted socially
but not emotionally” should be clearly recognized.

The occupational significance of the Bell is unknown, as no adequate
investigations have been made.

In schools and colleges this inventory may have some value as a screening
instrument for the location of maladjusted students, but other measures
have been proved more effective. The value of the part scores as
indices of the loci of maladjustment has hardly been demonstrated, and
the writer believes that once problem cases have been located by other
screening instruments such diagnostic matters can be better handled by
interviews or by projective tests such as the Thematic Apperception Test.
There is no evidence that the inventory has any value for directional
vocational or educational guidance.

In guidance centers also the use of this instrument hardly seems
warranted by what is known about it. Other inventories can screen more
effectively and diagnose more significantly, and the clinical techniques
are in any case rather readily used in such situations.

In employment services, business, and industry other inventories and
tests which have been studied with vocational purposes in mind have
been demonstrated to have some value, whereas there are no data which
indicate that this measure will help in personnel work.

The Minnesota Personality Scale (Psychological Corporation, 1941)

This inventory is included in this chapter, not because there is any
evidence of its validity in vocational counseling or personnel work, nor
because it is widely used and needs to be understood, but simply because
it is the most recent step in the evolution of a number of attitude and
personality scales which have been carefully studied. They have
contributed to our knowledge of social psychology if not actually to our
proficiency in vocational psychology. The first milestone was the
development of a technique for the study of attitudes by Thurstone (841),
simplified and applied more specifically to the study of morale by Likert
(17 0) and used effectively in a study of attitudes and unemployment by
Hall (324). The technique was further refined in an intensive
psychometric study of the effects of the economic depression of 1929-39
on personality, carried out by Rundquist and Sletto (fir,8) and resulting
in the Minnesota Scale for the Survey of Opinions (the intensity applied
more to the psychometrics than to personality). This inventory gave
scores for morale, feelings of inferiority, family attitudes, attitudes
toward the legal system, economic conservatism, attitudes toward
education, and general adjustment, attitude variables which it was thought
might be affected by prolonged unemployment. The present scale is the
result of a factor analysis of this and several other inventories by Darley
and McNamara (192).

Description. This inventory consists of five parts, or a total of 218
questions, the sections being designed to measure morale, social
adjustment, family relations, emotionality, and economic conservatism. Typical
items are:

Court decisions are almost always just. 

There is really no point in living. 

Do you have a fairly good time at parties?

Do you and your parents live in different worlds, as far as you are concerned?

The answers are arranged on a five-point scale of frequency or intensity,
depending on the trait. It is designed for use in the last two years of high
school, in college, and with adults; there are two forms, for men and for
women. It can be administered in less than 95 minutes to persons of
these educational levels, and is scored by stencils or by IBM machine.
Norms are for 2000 men and women freshmen at the University of
Minnesota; local norms would be needed if the inventory were much used,
because of the differences found in the attitudes measured by some of
these scales with differences in economic status and degree of
sophistication (379). The scales are quite reliable, ranging from .89 to .97 when
computed on a corrected odd-even basis (manual).
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Validity. The items were selected after a factor analysis and other 
studies of the Minnesota Scale for the Survey of Opinions, the Bell
Adjustment Inventory, and the two Minnesota Inventories of Social
Attitudes, which showed that the 13 scores of these inventories could be
explained by the five factors measured by the scales now comprising the
Minnesota Personality Scale. The inventory is therefore internally
consistent, and incorporates the best elements of its predecessors. Thus
refined, the Bell items take on a different character, for they are part of
internally consistent and relatively distinct factors: for example, the best
Bell health items became part of the emotionality scale. The authors
believe (manual) that these new scales should have at least the validity of
the parent scales; although this does not suggest much value for the
Bell-derived scales, it should be remembered that they were technically
improved and perhaps given new validity (which still needs to be proved);
and the Minnesota Scale for the Survey of Opinions was shown by
Rundquist and Sletto (f>y8) to have value for the study of the attitudes
and adjustments of the unemployed. Validation of this instrument
against external criteria is needed before it can be useful in practice.

Ratings of corresponding traits in 235 student nurses were made by
their supervisors and colleagues in a study by Bennett and Gordon (71),
which showed little relationship between the two sets of data. This has
generally been the case when ratings have been correlated with inventory
scores, and may only prove that ratings are of little value.

Success in flying training was the criterion used in a study initiated
by the writer and completed by Guilford (316:601-603). A group of 33S
would-be pilot cadets who took the test early in 1944 were sent to pilot
training, and subsequent reports of their success and failure were
correlated with the scale’s five scores. The biserial coefficients of correlation
ranged from —.09 to .oj, showing no validity for this purpose. No other
validation studies have been located.
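
A biserial coefficient of the kind reported in such studies relates a continuous score to a pass-fail criterion on the assumption that the criterion reflects an underlying, normally distributed variable; the point-biserial coefficient makes no such assumption and is simply the product-moment correlation of the score with the dichotomy. The Python sketch below gives the usual textbook formulas for both. It is an illustration only, not a reconstruction of Guilford's computations, and the function names and argument layout are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean, pstdev


def point_biserial(scores, passed):
    """Product-moment correlation between scores and a 0/1 criterion."""
    succeeded = [s for s, p in zip(scores, passed) if p]
    failed = [s for s, p in zip(scores, passed) if not p]
    p = len(succeeded) / len(scores)  # proportion succeeding
    q = 1.0 - p                       # proportion failing
    # pstdev is the standard deviation of all scores (divisor N).
    return (fmean(succeeded) - fmean(failed)) / pstdev(scores) * sqrt(p * q)


def biserial(scores, passed):
    """Biserial r, assuming a normal variable underlies the dichotomy."""
    p = sum(1 for x in passed if x) / len(passed)
    q = 1.0 - p
    # Ordinate of the unit normal curve at the point dividing p from q.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return point_biserial(scores, passed) * sqrt(p * q) / y
```

For the same data the biserial coefficient is always the larger of the two in absolute value (by a factor of about 1.25 when the group divides evenly), which is why the two kinds of coefficient should not be compared directly.
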

Use of the Minnesota Personality Scale in Counseling and Selection.
As there is no objective evidence on which to base suggestions for the use
of this attitude and personality inventory in educational and industrial
personnel work these paragraphs are limited to a few suggestions
concerning possible values. Even the best predecessors of this inventory were
never more than attitude-research inventories which were not put to use
in personnel work to any appreciable extent. Despite this, the Personality
Scale is technically good enough to merit research in practical situations.
Were it not for a few items which would probably not be acceptable to
many workers (e.g., one on the C.I.O.), it might be considered for use in
morale surveys, when it is desired to obtain data not only on satisfaction
with the specific aspects of the job and working conditions usually
covered in such surveys, but also on general morale and aspects of emotional
adjustment which are often the underlying causes of job dissatisfaction.
The acceptability of the items to employees must first be ascertained; if
unacceptable in their present grouped form, some could be modified and
they might be made more palatable by putting them in omnibus form
and thus burying the more personal items in fairly innocuous material.
Similarly a counselor in an educational institution who desires data
concerning the climate of student opinion may find a survey with this
scale helpful; it would not need modification for college use. Those whose
scores deviate considerably from the mean may be observed or
interviewed in order to ascertain the effect of their atypicality on their status
in the group. This application might also be made in industrial situations
if respondents were identifiable, but it seems likely that best results
would be obtained by securing anonymous responses when
administrative action might be feared.

The Rorschach Inkblot Test (Grune and Stratton; first published in
Switzerland in 1921)

This series of inkblots was developed by a Swiss psychiatrist, Hermann
Rorschach, as a measure of the underlying structure of the personality,
was experimented with by him and his students for a number of years
before it was introduced into the United States during the 1930’s, and
has grown rapidly in popularity as a clinical instrument since that time,
becoming practically a cult in some circles. The result has been a vast
amount of publication concerning it, and some research, most of the
writing being concerned with its use in personality study and clinical
diagnosis. Although some proponents of the technique have advocated
its use in vocational guidance and selection (e.g., 607), little evidence has
been adduced to justify or contradict the sweeping claims made for it.

The fundamental differences between this and other types of tests, the
varied aspects of the personality which it is purported to measure, the
internally consistent logic upon which it is based, and the dramatic use
to which it has sometimes been put, have given the Rorschach a wide
appeal; at the same time, the enthusiasm of its proponents and the extent
to which it has been based on clinical intuition and subjectively rather
than quantitatively analyzed experience have antagonized many more
scientifically minded psychologists. However, the proper approach is an
open-minded study of the instrument in which one can assess its
demonstrated value and establish hypotheses concerning its potential value,
which can then be tested experimentally. It is in that spirit that the
writer has attempted to deal with it in the following pages, for though
he has used the inkblots both in research and in clinical practice, and
believes them to be of value, he is not a “Rorschacher” or a cultist.

To attempt to treat the clinical validity of this complex and
subjectively scored test is unfortunately too sizable a task for a book such as
this. To explain the technique alone requires a whole volume, as has
been shown by Rorschach (f>p|) and later by Beck (55,56,57), Bochner
and Halpern (10S), and Klopfer and Kelly (433); a volume on its validity
is also needed, but has yet to be produced. The pattern followed in
discussing personality inventories will therefore be departed from, and
this section will attempt only to describe the test in sufficient detail to
provide an orientation to the procedure and to the nature of the test,
and to discuss the studies which have been made of its significance for
educational and vocational counseling and selection. Its clinical validity
will not be treated, a decision which seems justified also by the fact that
the inkblots are in any case a diagnostic rather than a screening device.

Description. The Rorschach Inkblot Test was designed originally for
use in the diagnosis of psychiatric disorders in adults. It has since been
used, however, with normal adults, adolescents, and children, and has
been found applicable to any person of school age provided the
interpretation is made in terms of the age group to which the examinee
belongs. The test consists of ten white cards, on each of which is reproduced
one large inkblot. Some of the inkblots are monotones (gray), while
others include color. The test is administered individually in clinical
and sometimes in personnel practice, the examinee telling what he
thinks each inkblot might be, the examiner recording responses on a
blank which includes outlines of the inkblots; this is followed by an
inquiry in which further details concerning responses are elicited, and
by a testing of the limits, in which the psychologist ascertains whether or
not the examinee is capable of giving certain types of responses which
he has not previously given (56,433). When used for screening (as it
occasionally is), for personnel selection, or for research the test is often
administered as a group test, the inkblots being projected onto a screen
and the examinees recording their responses on diagramed blanks; this
is followed by a modified inquiry, in which the examinees locate their
responses for the examiner; there is no testing of the limits (346). A
multiple choice form of the group test has also been developed (346), of
doubtful value as seen below.

Scoring the Rorschach is a time-consuming task when it is desired to
obtain a detailed clinical picture of the person being studied, and often
takes two or three hours. When Munroe’s inspection technique (551) is
used merely in order to derive an index of total adjustment the time may
be reduced to 15 minutes per examinee. In either case the person doing
the scoring must have had intensive training in the use of the test,
combined with a good background in clinical psychology, for despite the
lengthy and helpful discussions of scoring now available (57,433) the
procedures are quite subjective. Some users of the test are, in fact,
convinced that to objectify the procedure would be to destroy its clinical
value (433:20-21; 432; 57:vii).

The norming of the Rorschach has also been a sore point with many
psychologists. In general, Rorschachers have felt that clinical experience
and insight are sufficient to justify the interpretations commonly made,
and Rorschach's original insights are often appealed to as evidence of
the significance of a response. Others have been concerned with the
accumulation of norms for the various types of responses, for various
normal and clinical groups, in order that the clinical significance of a
response might be objectively demonstrated and verifiable by reference
to quantitative data. For example, Beck’s first monograph on the
inkblots was quite normative in its approach (55), but his two later books
(56,57) have been more subjective and more dependent on clinical
intuition. It is this very lack of objective norms for many aspects of the test
which makes clinical training and experience necessary to the users of
the Rorschach; it also makes essential a scientific attitude and a tendency
to seek objective evidence to justify clinical intuitions. The problem is
not as simple a one as to collect or not to collect norms, however, as the
scoring and interpreting often require the relating of one variable to
others in ways which do not lend themselves well to quantitative
treatment as we now understand it.

As responses and scoring have so far been discussed in abstract terms,
it may be well to make the subject tangible by describing some types of
responses and their scoring. One inkblot, for example, may look to the
examinee like a leopard skin, and the inquiry may explain that this is
because of the shape and because of differences in the furry texture. In
scoring this response three items are of interest; the examinee responded
to the whole picture rather than to details, seeing the inkblot as a unit
rather than as a number of disparate units; he responded thus partly
because of the form, and partly because of the texture which he used as
color. These three items are added to similar items obtained in response
to other pictures, giving scores, respectively, for the W response, F, and
Fc. Interpretation of the test then proceeds on the basis of each of these
scores, seen in the light of other related scores. W is thought of as
revealing a tendency to respond to wholes, to organize and synthesize; a high
score is taken as revealing superior intelligence, unless it is so high or
so superficial as to take on another meaning. F is thought of as a sign
of emotional control, although if it is high and certain other indices are
low it may mean rigidity. Fc, or the use of texture, is interpreted as a
sort of shock absorber, of controlled sensitivity to the environment. A
variety of other modes of response to the inkblots, and the content of the
responses, are also analyzed, various ratios are computed, and a profile
is plotted in order to facilitate study of the pattern of responses. A verbal
summary or personality sketch is then prepared on the basis of this
analysis. Most of the justification for these interpretations, it should be
emphasized again, lies in the intuition of clinicians who have used the
test and studied the responses of persons whom they had come to know
well by other clinical methods. Only a few of them have been validated
by objective methods.

Validity. It should be clear from what has preceded that the validity
of the Rorschach in personality diagnosis has been demonstrated largely 
by the extent to which clinicians have thought it agreed with psychiatric 
diagnoses. Studies with the Rorschach have been reviewed by Hertz 
(367,368); subsequent reviews have been published by White (921) and 
Kaback (41.1). The studies reviewed below are selected because they deal 
with the vocational significance of Rorschach indices. 

Grades in college were used as a criterion against which to validate 
the total adjustment score of the Group Rorschach in a study by Munroe 
(552). Her subjects were students at Sarah Lawrence College, where 
grades were not those usually given for specific course work, but faculty 
ratings of academic standing, a more general evaluation of the student’s 
status. The correlation was .49, as contrasted with one of .39 for the 
A.C.E. Psychological Examination. It would be desirable to have similar 
data for colleges in which more traditional marking methods are used. 

Success on the job has been studied with the Rorschach only in
unpublished investigations, so far as this writer knows. One large
department store has gathered data on executives employed during recent years,
testing them during the selection process and relating their scores to
ratings of subsequent success. Although the group in question is still
small (N — go) and the results tentative, they appear to be promising;
however, preliminary findings such as these are often reversed when
studies are completed. The Group Rorschach was used in a study of
aviation cadets made with the assistance of the Josiah Macy Foundation
early in World War II (oral report to officers of Psychological Research
Unit No. 1 by Miss Sadie Sender), the design of which was defective in
that cadets tested after failure were contrasted with successful pilots:
other studies had shown that eliminated cadets showed many symptoms
of depression early in the war, thus making their Rorschach scores
questionable. It was administered also to 660 aviation cadets tested in the
Aviation Psychology Program of the Army Air Forces in 1943, low
validities of doubtful significance being found for a few scattered single
indices which, when combined, gave a biserial coefficient of correlation
of .17 with success in pilot training (84); however, when this formula
was cross-validated on another group of 156 cadets it had a negligible
validity of .oj (797:555; 316:633). The Multiple-Choice Rorschach was
validated against success in pilot training with negative results (316:636).
In other similar studies the results were no better. The only conclusion
one can draw from these various studies is that if the Rorschach has
validity for the selection of personnel for various types of work (or, by
implication, the counseling of people concerning the appropriateness of
vocational choices), there is as yet no evidence to indicate just what single
or combined Rorschach traits might confirm one choice or contraindicate
another.

Occupational differences as shown by the Rorschach have been studied
by Kaback (414) and by Roe (635,636). Kaback used the Group Rorschach
results of 300 pharmacists and accountants, dividing them into
professional and preprofessional (student) groups. She found point-biserial
coefficients of correlation (to be distinguished from biserial coefficients)
of .54 and .65 between 24 Rorschach components and professional or
preprofessional group membership; in other words, there was a statistically
significant relationship between Rorschach pattern and occupational
group membership. Kaback points out, however, that the overlapping of
groups is so great as to make the application of her findings to
individuals highly questionable. The picture is further confused by the finding
of equally great differences between the employed and student groups
(point-biserials of .(>25 and .62), the thumbnail sketch of the employed
pharmacists having practically no resemblance to that of the
student-pharmacists although those of the two accounting groups are more
similar. The sketches of the two professional groups are summarized here
as illustrations of Rorschach results.

Pharmacists: intelligent adults whose impulse control functions well
in general with one limitation: their conscious repression of impulses
(F% = j 7) plays a relatively great role, and inner stability a relatively
smaller role (M = 2.09). Fairly marked amount of anxiety (presence of
K and k responses) which is counterbalanced by sensitivity to inner and
outer conditions (FK + Fc : FK + K = 2.28:1.19). Intellectual flexibility
marked (W, D, d, S present). Spread of interests somewhat limited (H + A :
r, other content categories). However, in terms of general adjustment, the
group falls within the general range.

Accountants: superior adults. Well-balanced impulse control; function
smoothly in conscious impulse control (F% = 44), rational behavior in
emotional situations (FC : CF + C = 92:59), inner stability (number M
present). This group has a tendency to attend more to stimulations from
within than to external stimulations (M : sum C = 3.1) and to use them
productively (W:M = 12:3). Conscious control refined by use of
shock-absorbing functions (FK + F + Fc% = 56) by being sensitive to inner
and outer conditions. Small amount of anxiety (some K and k) and a
slight tendency to overcautiousness in emotional contact with outside
world (FC + CF + C : Fc + c + C' = 1.5:3). Good mental elasticity (W, D,
d, Dd, S present) with widespread interests (H + A : 8 other content
categories). In general, a well-adjusted group.

Artists were the subjects of Roe’s first study (635). They were a group
of 20 eminent American artists whose co-operation was obtained in an
investigation of the effects of alcohol on the creative process. Her results
showed that the group was extremely heterogeneous on the Rorschach:
general adjustment scores ranged from 3 to 18 with a mean of 10.3, about
that which Munroe found predictive of maladjustment in college (552)
and higher than the mean of 7.7 which she found in paleontologists. No
control group was used, all other comparisons being with the extremely
subjective standards of Rorschach tradition. The artists tended to make
more than an average number of whole responses, responded more to
color and shading than men-in-general, and gave rather more than the
normal proportion of anatomical and sexual responses, neither of them
surprising in trained artists. The protocols of the tests were submitted
to a noted Rorschach authority (Dr. Bruno Klopfer) for “blind” analysis
(interpretation without benefit of other case material), his only data
being age, sex, and the fact of their being professionally successful. Two
of the protocols made it obvious that the men in question were connected
with art: Klopfer noted this, and stated that one was probably a
successful creative artist, but that the other was so lacking in creative ability
that it was improbable that he could be successful at it professionally.
Creative ability was noted in five others, but said to be limited in one
and unusable in another because of neurotic conflicts; in five others it
was said to be absent, and he implied its absence in three more; no
relevant comments were made concerning the other five, which implies
no notable creative ability. These findings are important, for
Rorschachers have without objective evidence set much store in the inkblots’ ability
to reveal creative ability, whereas these two competent Rorschachers
(Klopfer and Roe, who made a similar analysis before she asked Klopfer
to check hers) failed to find signs of creative ability in 15 out of 20
eminent artists. As Roe points out, creativity may not be required for
success in art in our culture; but the law of parsimony would seem to
require one to question an unvalidated assumption concerning a test
before questioning cultural standards. Roe’s general conclusion was that
despite some trends, as noted above, “there is no personality pattern
common to the group.”

Vertebrate paleontologists and technicians assisting them were tested
in the other study by Roe (636). The two groups numbered respectively
16 and 9, tested at their annual meeting with the Group Rorschach. The
general adjustment scores of the scientists averaged 7.7, of the technicians
9.1, not a significant difference and both below 10, which may tentatively
be considered the critical score for maladjustment. The three
best-established scientists made an average adjustment score of .j. Unlike the
artists, these two groups were found to be quite homogeneous in
personality patterns. Both groups tended to give whole responses, but as
would be expected those of the scientists were superior in quality to
those of the technicians, in keeping with their higher mental level.
Unlike the artists, the scientists as a group gave only one sex response
(several later said they had consciously suppressed these), while the
technicians gave more than the average number of human anatomical
and sex responses. The scientists gave an unusually large number of
animal anatomy responses, as might be expected in a group of men
whose work involves spending hour after hour with bones; the
technicians gave fewer, perhaps reflecting less absorption in their work
than their professional counterparts. Roe classified such responses as
“technical”, because appropriate to the profession; that medical students
give more anatomical responses than men-in-general (346) is further
evidence of the effect of interest and experience on test scores. The most
striking finding was the very small percentage of human movement
responses, considered indicative of creative imagination: the group
appears thus to have a decided tendency to react objectively to the outer
world, to avoid projecting themselves into situations and structuring
them in terms of their own needs. It is also interesting to note that
the three most successful men have what would be considered sufficient
movement responses, that is, enough creative ability to rise to the top
of their profession, in which the more completely objectively minded
worker normally does well. Color shock, or inability to handle color,
which is considered indicative of inability to handle social relationships
effectively, was also common in this group of men whose work permits
them to live in relative isolation and carries few social obligations. Roe
related her findings to a study of Munroe’s (553) with college girls,
which suggests that the personality patterns and vocational relationships
indicated here may exist before entry into occupations. In view of this,
her summary concerning this group of scientists takes on especial
significance:

. . . the men who follow this vocation show, as a group, certain definite 
characteristics of personality structure. They tend to abstractions, to formalized, 
objective, thinking, with a marked inhibition of any tendencies to project 
themselves into a situation. They empathize little, either with things or with 
other people and they have a rather passive emotional adaptation. There is 
further indication that within this group, those who have been able to maintain 
objectivity and at the same time not inhibit creativity, those who can to some 
extent at least project themselves, are the ones whose work is most broadly 
theoretical and most widely significant. Caution because of the small sample 
should be invoked here, yet the indication is entirely logical. (636:326).

The italics are the writer’s, for the results of a highly subjective test, 
based on groups of 16 and 9 persons, with no adequate controls, can 
be considered no more than tentative. But they are the most challenging 
of any study so far made of the relationship between personality and
vocational choice, and indicate that the technique should be further
developed and that other groups should be studied with it, in order to
add to our knowledge of vocational psychology and to the tools of
vocational guidance and selection.

Use of the Rorschach Inkblot Test in Counseling and Selection. No 
attempt has been made to assess, in this section, the validity of the 
Rorschach as an instrument for the clinical study of personality, although 
it is obvious that such validity would be helpful in counseling and 
evaluation because of the insights it would give into the types of
adjustment problems an individual might encounter and the amount of
difficulty he might have in handling them. The making of such an
assessment would require more space than is warranted in anything
other than a textbook of clinical psychometrics. Attention has been
limited to the relationship between Rorschach scores and vocational 
choice and success. 

The studies so far completed show that no relationship has been 
found between Rorschach indices and vocational success, although one 
study now in progress appears more likely to yield positive results.
Studies of occupational differences are by no means conclusive, in one
instance because of excessive overlapping of groups despite significant
differences and because of differences between employed and student
groups, and in another because failure to find homogeneity in the
occupation is questioned by the absence of a control group; in a third
group the numbers are so small and controls so lacking as to necessitate
drawing only the most tentative conclusions from what is otherwise a
most revealing and challenging study.

In view of the above, the Rorschach can be considered only an 
instrument which may be worth using in validation studies, as one which 
research may yet prove extremely valuable in vocational counseling and 
selection, but about which too little is now known to justify its use in 
practical personnel work. 

The Murray Thematic Apperception Test (Harvard University Press,
1935, 1943; Grune and Stratton, 1949)

This projective technique is, even more than the Rorschach, a clinical
device rather than an objective test, and one the occupational significance
of which is unknown. It is briefly described here for two reasons: it
has so challenged the interest of test users that questions concerning it
are common, and it has promise as a research technique for the study
not only of personality adjustment but, more specifically, of the
determinants of vocational choice and satisfaction. Unlike the Rorschach,
it is not a measure of the structure or organization of personality, but
rather a technique designed to bring out the content of the personality,
the needs, strivings, and environmental pressures which are felt by the
person being studied. This fact might lead one to question its potential
value as a device for use in directional vocational counseling or selection,
were it not that the needs or strivings which it reveals may well be the 
determinants of vocational choice and vocational interest. 

Description. The TAT, as this test is generally called, was designed
for use with older adolescents and adults, but pictures have since been
added which make it administrable to older children and younger
adolescents: the examiner merely selects the appropriate pictures. As
most of the studies made with it have been made with the older group,
however, more is known about its scoring and interpretation at that
level. The test consists of a series of 20 pictures for a given age and sex
group. The pictures are semistructured, that is, their content is more
like a specific object or scene than is the content of an inkblot or a
cloud picture, but expressions are sufficiently ambiguous and action
poorly enough defined so that it is possible for the subject to project
himself into the situation and shape it somewhat according to his own
needs and fears. Thus one scene depicts a human figure seated or kneeling
next to a seat, a small object on the floor or ground before him, head
bent and face hidden. To one person this figure represents a boy who
has just broken his mother’s favorite vase, at the remnants of which he
is staring; to another, a girl who has just shot her lover and, dropping the
pistol in front of her, is overwhelmed by her deed; to someone else it is
a young man, fondly gazing at a flower given him that night by his
sweetheart. Each person sees what he needs or wants to see in such a picture.

The test is administered individually, sometimes the examiner, and
sometimes the examinee, writing down the examinee’s story of how the 
scene came about, what is going on at the moment, what the characters 
feel, and what the end result will be. Scoring methods vary with the 
objectives of the examiner, and might better be called interpretive 
methods, for they are neither objectively based nor objectively expressed. 
Instead, the examiner analyzes the content in order to determine the 
underlying themes (hence the name of the test), to ascertain whether or 
not the plots are happy, logical, probable; and to find out with what 
kinds of heroes the subject identifies himself and the forces to which 
he feels subjected. The manual presents a somewhat more quantitative 
but time-consuming method for obtaining a weighted count of the needs
(e.g., abasement, aggression, dominance) and forces (e.g., affiliation,
aggression, loss) affecting the hero or examinee, a scheme useful when
research is being conducted in group differences or in relationships
between test and criteria. The norms in this scoring system consist of the
responses of normal college students; in certain other methods there 
are none, and the data are used simply as clinical or case-history material 
to be interpreted in the light of other personal data to make a dynamic 
and meaningful picture of an individual. It is obvious that, like the 
Rorschach, this test can be used only by well-trained and experienced 
clinical psychologists. Bellak ((i.|) and Tompkins (850) have published
scoring aids and manuals, each differing from the others in important 
aspects. 

Validity. The most intensive clinical validation of the technique is
reported by Murray (557) in a study of Harvard undergraduates, which
showed a high degree of consistency between TAT and other clinical
evaluations made independently. Harrison (344) found that conclusions
based on it agreed well with case-history material and psychiatric
diagnoses in a mental hospital. As the question of clinical validity is not
one of primary concern in this context, however, these and related
investigations will not be gone into in any detail: it is important only
that there are some indications of validity in what is still a clinical device 
which seems likely to develop into a test. 

Occupational differences in TAT patterns have been touched upon
by Roe in her study of the personalities of artists (635), the one published 
study in which the test has been applied to occupational groups (Neal 
E. Miller, John L. Wallen, and the writer used it with aviation cadets 
during World War II, for a clinical study of success and failure in flying 
training in which all data were merged to yield a dynamic picture of 
each cadet rather than to reveal group differences in test scores). Roe 
found the test difficult to administer to her 20 artists, as they were so 
critical of the artistic quality of the pictures that they found it difficult 
to focus on the telling of a story. Interpretation of results was made 
correspondingly difficult. The content of the stories was not unusual: 
a tendency toward feminine and nonaggressive identifications was noted; 
otherwise there is little that seems significant in the data, of which Roe 
does not seem to have pushed the analysis as well as she did that for the 
Rorschach. 

Use of the Thematic Apperception Test in Counseling and Selection.
This brief account of the TAT has attempted to make clear its embryonic
status and at the same time to suggest its promise as a device for
measuring, more subtly than any personality inventory, the needs which 
drive people and the forces which they feel pressing upon them. Although 
virtually no use has been made of the instrument for vocational
counseling or selection, and none should at present be made, the technique
is one which should be developed to a point which will make it useful 
in studying the needs and drives which are related to vocational choice 
and success, and for ascertaining the relationship between these and 
satisfaction in various types of work. It would be helpful, for example, 
to know that the need for winning affection is more often satisfied in 
social work or in teaching than in medicine or law, and to have an 
objective method of measuring that need. Such developments in the 
TAT are remote, but they are mentioned in the hope that research will 
be prosecuted which will bring them about. 

Trends in the Measurement of Personality. 

Perhaps the major trend in the development of instruments for the 
measurement of personality during the past 20 years has been one away 
from the inventory technique and toward various projective devices, 
illustrated by the disfavor with which personality inventories are
generally viewed and by the rapid growth in popularity of the two
best-known but complexly scored projective tests. At the same time there
has been a minor trend of considerable importance, an interest in the 
refinement of inventorying techniques, illustrated by the publication of 
factor-analysis-based forms such as those by Guilford (318), Darley and 
McNamara (192), and others, and by the empirically based Minnesota 
Multiphasic (353). 

Both trends can be traced to the low validity which has been seen 
generally to characterize personality inventories. In the first it caused 
a search for a more subtle and penetrating type of test which would 
probe underneath the sophistications and rationalizations of the subject 
in order to get at the structure and content of his personality; in the 
second it resulted in a greater emphasis on purity of factors in some 
inventories and on empirical weighting in others. The improved
personality inventories seem to this writer to be better stopgaps 
while the subtler projective techniques are being objectified and 
validated. 

These trends have manifested themselves in other ways which have 
as yet made less impact on applied psychology, but which should be
familiar to the practicing vocational counselor or personnel worker, and
which warrant experimentation by personnel psychologists. These will 
be very briefly described in the following paragraphs. 

Custom-Built Personality Inventories. A series of standard personality
inventories, including the Bernreuter, Adams-Lepley, Humm-Wadsworth,
Minnesota Multiphasic, Minnesota Personality, and the several Guilford
scales, were administered to aviation cadets who later went to
pilot-training schools during World War II. As reported in various official
bulletins and summarized by Guilford (316: Ch. 23), none of these had
any validity for pilot selection.

The Shipley Personal Inventory, developed for wartime use by the
Office of Scientific Research and Development (563,561), also had no
validity for pilot selection (316:004-007), although it did have validity
for certain other types of military selection and for screening
combat-fatigue patients (925:115-121). It is of interest because of its effective
use of the forced-choice technique, in which the subject must choose
between two sometimes innocuous but often offensive self-descriptive
items.

The Satisfaction Test (801:316:736-7.15) was one personality inventory
which did have some validity for pilot selection. It was developed by
Robert R. Blake, John L. Wallen, Joseph Weitz, and the writer with
the specific conditions of military life and wartime flying as their content;
for example:

If given the choice and having equal opportunity and ability, would
you rather

55. A. ambush the enemy? 

B. storm an enemy position? 

The keys were empirically developed, on the basis of item validation
against success in training. They were twice validated and cross-validated
on groups ranging in size from 800 to 2000 cadets, and each time had a
low but statistically significant validity of about .20. This would not
have been sufficient to justify using the test, had it not been that its
extremely low correlation with the selection battery made even this
validity a unique addition to the predictive value of the battery.
Conclusions drawn from this study have been summarized as follows
(801:744):

1. When a valid battery of aptitude tests has been developed and new 
aptitude tests are found merely to measure the same thing in different
ways, thereby adding little to the validity of the existing battery,
personality inventories may be worth considering.

2. In such a situation, the personality inventory may have low validity,
both absolutely and relatively to the aptitude tests, but, if the
relationship to the criterion is significant, it will have a unique contribution to
make to the battery.

3. Standard personality inventories are less likely to be valid, because
of their general terms and situations, than custom-built inventories based
on analyses of the behavior and attitude-evoking situations in the
vocation or in the employing organization.

4. Empirically validated success-failure keys, checked against the logic
of the situation and of the item, are likely to prove more valid than keys
based on clinical judgment or on an internal rather than external index
of validity.
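
The first two conclusions above rest on simple arithmetic, which may be sketched as follows. The sketch uses the standard formula for the multiple correlation of a criterion with two predictors. Only the inventory validity of .20 comes from the study described; the battery validity of .50 and the two intercorrelations are assumed figures chosen for illustration, and the function name is hypothetical.

```python
import math


def multiple_r(r_battery, r_inventory, r_between):
    """Multiple correlation of a criterion with two predictors.

    r_battery   -- validity of the existing battery (assumed figure)
    r_inventory -- validity of the added inventory
    r_between   -- correlation between battery and inventory
    """
    if abs(r_between) >= 1:
        raise ValueError("predictor intercorrelation must lie strictly between -1 and 1")
    r_squared = (
        r_battery ** 2
        + r_inventory ** 2
        - 2 * r_battery * r_inventory * r_between
    ) / (1 - r_between ** 2)
    return math.sqrt(r_squared)


# Assumed battery validity of .50; inventory validity of .20 as reported.
uncorrelated = multiple_r(0.50, 0.20, 0.0)   # inventory unrelated to the battery
overlapping = multiple_r(0.50, 0.20, 0.40)   # inventory overlaps the battery

print(f"combined validity, inventory uncorrelated with battery: {uncorrelated:.3f}")
print(f"combined validity, inventory correlated .40 with battery: {overlapping:.3f}")
```

Under these assumed figures the uncorrelated inventory raises the combined validity above that of the battery alone, whereas the overlapping inventory adds nothing, which is the sense in which a low validity can still be a unique contribution.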

Situation Tests. The situation test is one in which the examinee is
put in a partly rearranged but real-life situation and his behavior noted
and analyzed. The technique was first developed by German psychologists
(2 15), was experimented with in the selection of reserve officers at
Harvard under Murray, and was used extensively by the Office of Strategic
Services under Murray’s direction during World War II (33,558). It was
relied upon there, despite its cumbersomeness and lack of proved validity,
because it was felt that the screening of superior men and women for
confidential assignments in which effective social relations, leadership,
and discretion were vital could not be better done in any other way.
The tests were administered during a sort of house party in which 18
candidates, seven psychologists, psychiatrists, and sociologists, and eight
junior psychologists participated for three and one-half days. A variety
of standard tests, performance tests, projective tests, interviews,
psychodrama, and casual observations were used, but the techniques of interest
in this context are a series of leaderless-group-situation and individual
situation tests. In the Wall Test, for example, six examinees were
assigned the task of getting a heavy eight-foot log over two parallel
ten-foot walls, set eight feet apart, without touching the ground between the
walls: this gave opportunities to observe leadership, social relations,
initiative, practical problem-solving ability, etc. In the Construction Test
the candidate had to build a five-foot cube with a glorified tinker-toy,
aided by two helpers whom he was to direct; the helpers were junior
psychologists whose task it was to turn lazy, recalcitrant, and insulting in
order to test the candidate’s frustration tolerance; the task was never
completed and some candidates became either very upset or enraged by their
humiliations.

Observation during these situation tests made it possible to rate
candidates for emotional stability, social relations, energy and zest,
leadership, security, and other traits, and a staff conference synthesized the
findings into a job-fitness rating and an evaluation note. The criteria of
success used were far from perfect, but the validity of the total procedure
appears to have been higher than .45. No data are available to show the
validity of any one test used in this program, although that would be
essential to the evaluation and improvement of the procedures.

A group of somewhat similar devices was tried out in the Aviation 
Psychology Program of the Army Air Forces (316:656-669; 797:554-555),
but in the regular testing of 400 cadets per day rather than in the
intimacy of a week-end house party for a score of men. The test situations
were the Observational Stress Test developed by Glen Heathers, the
Observations During Rest Period devised by the same psychologist,
and the Interaction Test planned by the writer. In the first the examinee
was rated for promise as a pilot on the basis of his observed reactions
when presented with a confusing multiplicity of stimuli while
manipulating the controls of an airplane; in the second similar ratings were made
while the cadet waited in a room furnished with a bomb, a twisted piece
of fuselage, and other reminders of the dangers of military flying; in the
third, the same type of rating was made (by a different psychologically
trained enlisted man) while four cadets jointly assembled three Wiggly
Blocks, a situation calculated to provide some opportunity to reveal
leadership, ingenuity, and ability to co-operate. Only the ratings based
on the Observational Stress Test and the Interaction Test had any
validity when correlated with success in flying training, and these were so
low as to be doubtful. In addition to the overall ratings of pilot promise,
specific ratings of co-operation, leadership, emotional stability, and other
traits were made during the Interaction Test, but these validities also
were negligible (rbis ranged from -.17 to .13).
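
The coefficients just cited are biserial correlations, appropriate when a graded rating is related to a pass-fail criterion that is assumed to reflect an underlying normally distributed variable. The following is a minimal sketch of the textbook computation, not the procedure actually used in the program; the function name and the ratings shown are hypothetical.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(scores, passed):
    """Biserial correlation between a graded score and a pass-fail criterion."""
    passing = [s for s, ok in zip(scores, passed) if ok]
    failing = [s for s, ok in zip(scores, passed) if not ok]
    if not passing or not failing:
        raise ValueError("both criterion groups must contain at least one case")
    sd = pstdev(scores)  # standard deviation of all scores
    if sd == 0:
        raise ValueError("scores must vary")
    p = len(passing) / len(scores)  # proportion passing
    q = 1 - p                       # proportion failing
    # Ordinate of the unit normal curve at the point dividing p from q.
    ordinate = NormalDist().pdf(NormalDist().inv_cdf(q))
    return (fmean(passing) - fmean(failing)) / sd * (p * q / ordinate)


# Hypothetical ratings of six cadets and their training outcomes.
ratings = [4, 5, 6, 5, 6, 7]
graduated = [False, False, False, True, True, True]
print(f"biserial r = {biserial_r(ratings, graduated):.2f}")
```

Because the coefficient depends on the assumption of normality in the criterion, it can exceed the ordinary product-moment value computed from the same data, and occasionally exceeds 1.00 in small or irregular samples.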

Perhaps the above-cited data only demonstrate that these specific 
forms of the situation test have no predictive value for this one type of 
behavior, success in flying training. But other measures had validities 
ranging up to .51 (214:191) for this type of behavior. It therefore seems
clear that each such test should be specifically validated for the type of
behavior it is intended to predict, until the remote day comes when they
have been demonstrated to be such pure measures of traits, the respective
significance of which has in turn been proved valid for each specific
type of vocational activity, that it is safe to generalize from test to
occupation without a correlation coefficient to justify the prediction.

Situation tests therefore appear to be promising techniques for the 
study of personality with its vocational implications, but their
demonstrated validity is not at present such as to justify their use on any basis
other than that of clinical intuition. The underlying logic and their face
validity suggest, however, that they should be experimented with in
personnel selection programs and their validity established, particularly for
positions of leadership and responsibility. 

Incomplete Sentences Tests. In this technique the examinee is
presented with a list of incomplete sentences or stimuli such as “I wish . . .”,
“My boss . . .”, “The work I do . . .”, and “My mother . . .”. The
specific stimulus phrases vary with the purpose of the test, with the
attitudes and traits which it is desired to assess. They have been
experimented with by Payne (595), Thorndike and Lorge (484), Rohde (642),
Sanford (GOG), Tandler (817), Rotter and Willerman (655), and in an
unpublished study of commercial airline pilots by Hobbs; the technique
originates in the intriguing word-association technique developed by
Jung, experimented with rather fruitlessly by many investigators (810)
and most recently reviewed by White (921). The special advantage of
this open-end or sentence-completion technique is the freedom which it
leaves the examinee to reveal his true feelings by the way in which he
structures a semistructured situation. This complicates scoring, but
devices are being experimented with for the categorization of responses
in such a way as to make possible the rapid classification and scoring of
the completed sentences. Although there is little evidence as yet, what
there is seems to suggest that the technique may develop into a method
of measuring attitudes and needs which is more subtle and more valid
than the attitude or personality inventory. If so, it may prove useful in
vocational counseling and personnel work when problems of job
satisfaction and morale are likely to be important, and also in screening
maladjusted persons for clinical counseling.



CHAPTER XX 

APPRAISING INDIVIDUAL 
VOCATIONAL PROMISE 


Preliminary Considerations 
Focus on the Individual 

IN THE early chapters our attention was focused on the logic and
steps of test construction and validation, on the nature and occupational
significance of a variety of aptitudes and traits, and on instruments for
their measurement. As pointed out in the introduction, this focus was
chosen because in actual work with tests one begins perforce with a
test result, and proceeds to study the significance of that score for the
occupational plans of the person being counseled. Intimate knowledge
of the construction and validation of each test used is essential to test
selection and to test interpretation. But in appraising individual
vocational promise, whether in a counseling or in a personnel capacity,
there are other steps which precede and follow the selection and
interpretation of tests. When the focus is on the individual rather than on
a test the perspective changes and other considerations come to the fore.
For these reasons it is the purpose of this chapter to consider the use of
tests in appraising individuals.

What is said here does not bear on the work of the psychologist or
personnel man who is using tests mechanically in large-scale selection
programs; in such work the procedures are those of test development,
described in another chapter; test interpretation is then simply the
statement of chances of success as expressed in a numerical score. For
example, it was ascertained through test validation that an aviation
cadet with a pilot stanine of 9 had 84 chances in 100 of being successful
in flying training, whereas a cadet with a stanine of 1 had only 19 chances
in 100 of succeeding in flying training (214:145). In such operations there
is neither a problem of selecting appropriate tests nor one of synthesizing
the results and evaluating their significance for a given individual, for
test selection has been taken care of in the test development program,
and synthesizing results and teasing out meaning has been taken care
of by the validation and scoring processes. For a more detailed discussion
of statistical interpretation, see Appendix A.
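
The mechanical interpretation described above amounts to reading an expectancy table. A minimal sketch of how such a table might be tabulated from validation records is given below. The two success rates are those quoted above; the group sizes of 100 cadets per stanine and the function name are assumptions made only so that the example reproduces those rates.

```python
from collections import defaultdict


def expectancy_table(records):
    """Return chances in 100 of success for each score level.

    records -- iterable of (score, succeeded) pairs from a validation study
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for score, succeeded in records:
        totals[score] += 1
        if succeeded:
            successes[score] += 1
    return {
        score: round(100 * successes[score] / totals[score])
        for score in sorted(totals)
    }


# Hypothetical validation records: 100 cadets at each of two stanine levels,
# with outcomes chosen to match the rates quoted in the text.
records = (
    [(9, True)] * 84 + [(9, False)] * 16
    + [(1, True)] * 19 + [(1, False)] * 81
)

for stanine, chances in expectancy_table(records).items():
    print(f"stanine {stanine}: {chances} chances in 100 of success")
```

Once such a table exists, interpretation requires no judgment from the examiner: the score is looked up and the chances read off.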

But the material of this chapter is of importance to the worker who
must operate without extensive previously validated test batteries. It
is important to most persons working in small organizations, with small
departments, with executives even in large organizations, and as private
consultants. Many of the applicants appraised in these situations are
considered for positions which have not been thoroughly studied with
tests, and which sometimes cannot be so studied in time to help with the
solution of immediate problems. In such instances the user of tests must
operate more as a counselor or clinician, bringing together bits of
information about tests and about jobs in order to make the best possible
appraisal.

The material in this chapter is even more important to the vocational
counselor whose function it is to help his clients to obtain the most
accurate possible picture of their abilities and interests in relation to
occupational opportunities. In such work the counselor usually has to
help the client do what he should have been doing for some years
previous: review his school, leisure-time, and work experiences in order to
understand what they reveal concerning his vocational abilities. As
pointed out in an earlier chapter, vocational appraisal in counseling
often requires the analysis of a much greater number of abilities, and the
consideration of the requirements of a much greater variety of
occupations, than does appraisal in selection work. Needed occupational norms
are often not available, and those that might be used are often for
populations of such specialized characteristics as to make generalization
to other seemingly related occupations a questionable procedure. The
use of tests by a vocational counselor is therefore of necessity generally
not a predictive process but rather a clinical procedure. A variety of data
have to be studied in relation to each other, and hypotheses are
established for the consideration of the client. It should be noted that the
term hypotheses is used, rather than conclusions, as their bases are not
definitive enough to warrant the term conclusion. The client decides
which hypothesis seems most likely to him, aided by the mature
experience and accepting attitude of the counselor, and proceeds to test it by
embarking upon an appropriate plan. This plan is subject to review and
revision on the basis of subsequent experience, either with the continuing
aid of the counselor or by the client alone.

Selecting Appropriate Tests

When utilizing psychological tests for the appraisal of vocational 
promise, the first problem with which one is confronted is that of the 
selection of tests suitable to the person and purpose at hand. Until all 
people have a uniform cultural and educational background, and a 
standard battery has been developed and validated for a great many 
occupations (should that time ever arrive!), this is no mean problem. At 
least four considerations must be kept in mind in making the selection. 

The person to be tested must be understood. The psychometrist or
counselor selecting the tests must know certain obviously important facts
such as age, amount of previous education and approximate intellectual
level. All of these affect, for example, the choice of the Kuder or the
Strong interest inventories; age and intelligence, the choice of the
O’Rourke or Bennett mechanical aptitude tests. As has been well
demonstrated by the investigations of social psychologists interested in race
differences and by the experience of vocational counselors working with
refugee groups, the cultural background of the client is equally
important. Even when there are no language differences, differences in
experiences peculiar to a sub-culture can affect the appropriateness of a test. It
has been found, for example, that in a picture-completion test
standardized
on American children and depicting, among other things, a boy about
to kick something which he has just dropped from his extended hands, 
Scottish children often make the mistake of giving the boy a pumpkin to 
kick instead of the oval football. The reason is clear: their football game 
is soccer, in which a round ball is used, and they are familiar neither with 
oval balls (despite English Rugby-football) nor with pumpkins. Some 
of our other tests, designed by and standardized upon residents of the 
Northeastern and Middle Western states, are not fully applicable to
those who reside in other parts of the country. 

The purpose of testing must also be clear to the selector of tests. Is 
the objective a survey of the abilities and interests of the client, in order 
to ascertain which areas might profitably be explored either by tests 
or by life experiences? If so, a combination of tests which tap a number 
of fundamental abilities and interests is desirable, even though
occupational norms may be defective, for the important thing is to locate
strengths for further study. Or is the aim to make an intensive analysis 
of some one or two areas, in order better to understand and evaluate the 
possibilities of assets already known to exist? In this case, a number of
tests measuring varied aspects or manifestations of the same aptitudes
may be desirable, to make possible a detailed study of an area. For 
example, a number of tests of manual dexterity may be used to determine 
just what type of hand-and-finger operations the client performs with 
the most skill, or several interest inventories may be administered so 
that discrepancies between patterns on tests constructed in different 
ways and using different types of items may suggest special outlets to be 
avoided or sought. 

Whether the testing is to aid in guiding development over an extended 
period or to help in making an immediate decision is another aspect of 
the purpose of testing. A young man who has left school with no intention 
of continuing his education but who wants to get started in a field in 
which he may be able to learn and progress on the job is in quite a
different position from another who expects to go to college and wants help
in deciding what to major in and at what field to aim. Directional
guidance is sufficient for the latter case, and this calls for a variety of tests and
inventories in order to check the level at which he may work and to point
out occupations which he may do well to explore in courses,
extra-curriculum, and summer jobs. But, reluctant though one may be to work in
such a way, the case of the young man deciding on an entry occupation
requires careful study of qualifications for immediate employment. The
study must cover previous experience, and that is often very helpful; but
in other instances test results are the most tangible and clear-cut guide
available. The battery of tests must therefore be one which throws direct
light on qualifications for entering at once into any one of several
occupations under consideration. To fail to get all possible meaning from tests
in such a case is to leave the making of the decision largely to chance. 

The vocational aspirations of the client are a third factor determining
the selection of tests. The psychometrist or counselor must know not only 
the background of the client and the nature of the service being rendered, 
but also the ambitions or goals which the client has in mind. He must 
know what educational and occupational level he hopes to attain, as some 
aptitudes are more important at some levels than at others (e.g., clerical 
perception); he must also consider what type of occupation the client 
hopes to enter, as that will help him decide how fully to test in special 
areas such as the technical and linguistic. 

Test data constitute the last type of information necessary in the selec¬ 
tion of tests for use in counseling. Knowing the client’s status and goals, 
one must choose tests which have appropriate contents and norms, which 
are known to measure traits relevant to the choices in question, which
measure these reliably, and which can be administered and scored in the 
time available. There should be no need to elaborate on these points in 
such a treatise as this. 

Three Methods of Vocational Diagnosis 

Historically and currently, there are three methods of appraising the 
vocational promise of an individual with the aid of tests: one is clinical 
and two are psychometric. Their fundamental differences lie in the way
in which tests are used. In the clinical method the results of each test are
viewed singly and in relation to other tests and to personal and social
data. All of these are weighted mentally, and a subjective judgment is 
made on the basis of this weighting. In the psychometric profile method 
test scores and other quantifiable data are compared with occupational
norms, as when an individual’s test profile is plotted and visually matched
with those of various occupational groups to ascertain which he resembles
most closely. In the psychometric index method quantification is carried
one step further to permit the expression of the individual's summarized
test scores in one total score or index. This shows how he compares with
members of the occupation in question. Thus in the Aviation Psychology
Program of the Army Air Forces the scores of each cadet were statistically
weighted and combined to yield three scores or stanines which expressed
his standing as a prospective pilot (the pilot stanine), navigator (the
navigator stanine), and bombardier (the bombardier stanine). These
procedures are discussed at some length in the following sections.

The Clinical Evaluation of Test Data 

The clinical method of evaluating test scores was the first to be used in 
vocational counseling, because occupational data were not available to 
make possible the psychometric methods. It has not often been described
in the literature, perhaps because its very subjectivity makes it difficult
to describe; a good recent discussion of test interpretation has been
prepared by Harmon (332), emphasizing the profile method but including
the clinical. Its advocates are many, and there are many who claim that
it is not only the first method to have been used but also the ultimate 
method, to which all will turn when the defects and limitations of the 
psychometric methods are more clearly understood. This argument is 
met with the reply that, as psychometric methods improve, more factors 
will be more adequately taken into consideration and judgments made
subjectively by the counselor will be made objectively by psychometrics.
The reasoning underlying this statement is that anything that exists can
be measured, and that any relationships which exist can be quantitatively 
expressed: if the clinician can do it, science can do it more accurately. 
The writer is inclined to agree with this latter position, but to recognize 
also that science must make a good deal of progress before all significant 
factors and relationships can be quantitatively measured and expressed. 
For this reason the clinical method of test interpretation is of great prac¬ 
tical importance and should be adequately described. 

The objective of the clinical method is to describe the individual in 
dynamic terms, in the expectation that a good picture of the person will 
make possible inferences concerning occupational success and satisfaction. 
The underlying hypothesis is that genuine understanding of a person,
combined with insight into a situation, permits one to foresee the
interaction of forces and predict the outcome. More humbly and accurately put,
they permit one to set up hypotheses concerning the probable outcomes. 
Even when stated in these terms, it is clear that the clinical method takes 
on no mean job and makes claims as great as those of the psychometric, 
perhaps even greater, for the best psychometric predictions are made with 
full consciousness of the limited basis upon which they are founded, 
whereas the clinical method attempts to take into account all that is 
relevant. It puts great weight on the training, insight, and objectivity of
the counselor. 

In the writer’s experience as a counselor, supervisor of counselors, and
counselor-trainer, there have seemed to be three principal techniques of
utilizing the clinical method. These are the case conference, discussion
with the client, and the preparation of psychometric reports. Separate
chapters are devoted to the last two topics, so they will be only briefly
mentioned here.

In the case conference the test scores are presented so that all in
attendance may see them, generally on a blackboard. Sometimes they are simply
listed, and sometimes they are plotted in graphic or profile form. The
counselor orally summarizes the background information, giving the staff
an outline of the socio-economic status, education, previous experience,
interests, aspirations, and presented problem of the client. The counselor
or psychometrist then reviews the test scores, commenting on any
observations made during testing that may add to the data. The case is then
thrown open for factual questions, after which members of the case con¬ 
ference raise questions of interpretation, propose interpretations of their 
own, and make suggestions for further investigation or counseling. At the
close of the conference the counselor or chairman summarizes the discus¬ 
sion, perhaps attempting to present an integrated picture of the case as 
seen by the conference. The focus may be on diagnosis, but in practice it 
generally includes also the nature of the counseling and the resources 
which may be utilized in implementing the counseling. 

Case conferences such as these are unfortunately rarely held in service 
agencies other than hospitals and special institutions, largely because of 
the amount of time they require. They are common in training situations,
whether academic or institutional, and some service agencies make a 
practice of holding them occasionally as an in-service training or super¬ 
visory device. They have a number of advantages as a clinical diagnostic 
technique: 1) they utilize the insights and resources of more than one 
counselor; 2) they are a safeguard against blindspots and biases; 3) they 
force the crystallization of ideas which might otherwise not be made 
clear and concrete. 

Discussion with the client resembles the case conference as a technique, 
but with the important difference that at least one of the discussants is 
untrained in the use of tests and is emotionally involved in the proceed¬ 
ings. Despite these facts such discussion does a great deal to clarify the 
counselor’s thinking about the significance of the test scores, partly be¬ 
cause of the freshness of another person’s point of view, and partly because 
the opportunity to think out loud brings ideas to the surface. Further¬ 
more, the client’s reactions to the data and to the counselor’s tentative 
interpretations (often put in the form of a question beginning with 
“Could that mean . . . ?’’) provide a healthy corrective for the counse¬ 
lor’s own possible biases. This procedure is discussed at great length in 
the next chapter, from a somewhat different viewpoint. 

The preparation of test reports is perhaps the commonest and best 
technique for the application of the clinical method of test interpretation. 
In writing up the results of testing the counselor not only expresses the 
test scores in verbal form, but discusses the significance of background 
data, observed behavior, and client attitudes and statements for the inter¬ 
pretation of test scores, relates test scores to each other and to these non¬ 
test data, and draws conclusions concerning the true characteristics of the 
individual being studied. These are then related to each other in a final 
summary or thumb nail sketch of the client as seen through the inter¬ 
preted test data. This process, like the others described, forces the coun¬ 
selor to crystallize his ideas and to justify his interpretations, at least in 
his own eyes and in the eyes of any potential reader. It thus ensures more
thorough exploration of the data than would a mere mental interpretation 
of test scores, and provides something of a safeguard against the indul¬ 
gence of bias and the riding of hobbies. This technique also is treated 
at greater length and for a different purpose in a later chapter. 

The picture of a person obtained by the above methods is probably as 
adequate as any. The interpretation of test data and case-history mate¬ 
rial, if the data themselves are skillfully obtained, is in fact the only 
method available for the psychological description of an individual. But 
from the point of view of vocational counseling, the defects of the clinical
method are two: 1) the evaluations or judgments made are subjective, so
that even a group of experienced counselors may be wrong, and 2) the 
best techniques for describing the psychological characteristics of an in¬ 
dividual may be lacking in data concerning their occupational signifi¬ 
cance. 

In the occupational applications the judgment of the counselor again 
becomes of fundamental importance. One might cite the O’Connor 
Tweezer Dexterity Test as an example, interpreted for years as a measure 
of significance for success in dental school, but shown by the majority of 
studies to have doubtful validity for that purpose. Or reference might 
be made to Thurstone’s work with primary mental abilities tests, which 
there is every reason to believe measure basic human aptitudes but which 
have not yet been actually demonstrated to have occupational significance. 
Several wartime aviation psychologists were certain that they could make 
better predictions of success in flying training by clinical interpretations 
of test data than were provided by the objectively obtained stanines, but,
either because of the inadequacy of some of the tests used or because of 
their lack of knowledge of flying, or both, their predictions had no
validity (316:669; 616:797). As it is known that many instruments are good
measures of psychological characteristics of one kind or another, and 
relatively few have been validated for many occupations, it is probably 
in the making of occupational applications that the clinical method 
makes the gravest errors. It is one which should be used only by
counselors who have acquired both an intimate knowledge of tests and an even
greater fund of information concerning occupational activities and
requirements.

Methods of drawing on a general fund of occupational information for 
the clinical interpretation of vocational tests, and of adding to that fund 
when it is not sufficiently great or detailed, deserve some mention: they 
are even less frequently treated in the literature than are methods of
analyzing test results in order to prepare a psychological sketch of an
individual. They amount to the making of a job analysis by the
psychometrist or counselor. If he has a good fund of vocational information the
job analysis is of the armchair variety: the counselor mentally reviews 
the functions, duties, and tasks of workers in the occupation, and makes 
deductions concerning the aptitudes and traits which seem to be required; 
he checks these deductions against what he knows of the published 
material on test validities and against the expressed opinions of others
who are familiar with the work in question. The list of characteristics
thus drawn up in his mind, and perhaps put on paper, serves as a guide in
considering the client’s qualifications for work of the type in question.

If the counselor lacks sufficiently detailed information concerning the 
occupation in question, the job analysis must be made from a vantage
point other than the armchair. The first step may be familiarization with 
printed material in the form of occupational and industrial descriptions 
such as are listed in Shartle (714) and in Forrester (263), but such data are
often too general to provide the insights needed into the aptitudes and 
traits which make for success on the job. The counselor then needs to go 
to the job itself, observing workers in action, familiarizing himself with 
the knowledge, tools, processes, and problems of the occupation. This 
takes time, but it is the accumulation of information acquired in such 
first-hand contacts with vocations and workers which distinguishes the 
vocational counselor from the clinical psychologist. The latter knows 
diagnostic and counseling techniques, and has insight into the dynamics 
of human adjustment, but unless he has had a great deal of contact with 
workers and has studied their work he is not qualified to do vocational 
counseling. The techniques used in these field studies and observations 
are of course the standard techniques of job analysis as used in the
preliminary work of test development. Shartle (714) has described them in
his text on the collection and organization of occupational information. 

The Psychometric Profile Method 

The first attempts to objectify the clinical method of test interpretation 
consisted of administering batteries of tests to persons in a variety of 
occupations in order to ascertain the nature of the patterning of test 
scores. This was the method developed by the Minnesota Employment 
Stabilization Research Institute (223,589), which used a standard battery 
of intelligence, clerical, mechanical, spatial, and manual dexterity tests, 
administering this battery to groups of clerical workers, department
store clerks, policemen, janitors, accountants, casual laborers, and others. 
The mean scores made by each group on each test were ascertained, and 
a profile plotted for each group, as shown in Figure 8. This made it 
possible to give the same battery of tests to a client, and to compare the 
patterning of his scores with that of accountants if he aspired to be an 
accountant, or to the patterning of the aptitudes of policemen if that was 
an occupation to be considered. 
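
The comparison of a client's profile with group profiles can be sketched numerically. What follows is a minimal illustration in Python, not the Minnesota procedure itself: the test names, the group means, and the use of the mean absolute difference in standard-score units as a measure of resemblance are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical mean standard scores (mean 50, SD 10) for two occupational
# groups on five tests; the figures are invented for illustration only.
GROUP_PROFILES = {
    "accountant": {"intelligence": 62, "clerical": 65, "mechanical": 50,
                   "spatial": 52, "dexterity": 48},
    "policeman": {"intelligence": 52, "clerical": 50, "mechanical": 55,
                  "spatial": 53, "dexterity": 54},
}


def profile_distance(client, group):
    """Mean absolute difference between client scores and group means."""
    return mean(abs(client[test] - group[test]) for test in group)


def rank_occupations(client, profiles=GROUP_PROFILES):
    """Return occupations ordered from most to least similar profile."""
    return sorted(profiles, key=lambda occ: profile_distance(client, profiles[occ]))


# Hypothetical client scores on the same five tests.
client = {"intelligence": 60, "clerical": 63, "mechanical": 49,
          "spatial": 55, "dexterity": 47}
print(rank_occupations(client))
```

A small distance indicates only that two profiles are alike in level and shape; it does not say whether a given difference is large enough to matter, which is the difficulty taken up in the next paragraph.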

The technique had a number of serious limitations, of which its
originators were well aware. One was the limited number of occupations for
which patterns could be obtained; this was in part remedied by the
generalizations of experts in the Minnesota Occupational Rating Scales (591).
Another was the difficulty of deciding when an individual's profile differed 
significantly from that of an occupational group, discussed in connection 
with the USES General Aptitude Test Battery; this was then remedied 
only by the judgment of the counselor, making the method partly clinical 
in nature. A third was the limited number of characteristics appraised by 
the test battery and included in the profile; this also had to be remedied by 
the counselor’s clinical skill and occupational knowledge. As in the case 
of the clinical method, too often counselors have knowledge of tests or 
knowledge of occupations without having both. Finally, the populations 
used to establish occupational ability profiles in the Minnesota project 
were selected as representing the local population, leaving the question 
of their applicability in other localities unanswered. 





The Occupational Analysis Division of the United States Employment 
Service carried work with this technique further, partly for selection and 
partly for guidance purposes. In the former program batteries of the most 
valid tests were used in varying combinations to establish profiles for 
each job studied; in the latter, a standard battery was administered to 
persons employed in various families of occupations and patterns of apti¬ 
tudes were ascertained. This work has been described in Chapter 15, its 
outcome being the USES General Aptitude Test Battery. The difficulties 
discovered in the MESRI work were minimized in the USES project by 
classifying occupations in families in such a manner as to make some 200 
profiles represent approximately 2000 major occupations, basing the 
profiles on critical minimum scores rather than mean scores, selecting 
the tests for inclusion in the battery on the basis of a factor analysis of 
vocational aptitudes, and sampling occupations in various key parts of 
the country rather than in one or two localities. As was pointed out in the
discussion of the tests, the battery still has defects, but it represents a 
great advance in the occupational ability pattern or psychometric profile 
method. Its usefulness is limited, however, to the tests used in the original 
battery (not available except to the state employment services) and to the
occupations already studied. It makes one further contribution, in that a 
counselor who knows the patterns established by the General Aptitude 
Test Battery, and who has a real understanding of vocational processes 
and requirements, can use this fund of information to provide an objec¬ 
tive foundation for the exercise of clinical insight when working with 
tests and occupations for which occupational ability pattern data are 
lacking. 
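
Basing profiles on critical minimum scores rather than on means changes the comparison from one of resemblance to one of hurdles: the client either meets every minimum for an occupational family or does not. A minimal sketch in Python follows; the family names, aptitude letters, and cutoff values are hypothetical and are not actual General Aptitude Test Battery norms.

```python
# Hypothetical occupational aptitude patterns: each occupational family
# lists the aptitudes it requires and an invented minimum score for each.
PATTERNS = {
    "family_A": {"G": 110, "N": 105},
    "family_B": {"S": 100, "M": 95, "F": 90},
}


def qualifies(scores, pattern):
    """True if the client meets or exceeds every minimum in the pattern."""
    return all(scores.get(apt, 0) >= cutoff for apt, cutoff in pattern.items())


def qualifying_families(scores, patterns=PATTERNS):
    """List the families whose critical minimum scores are all met."""
    return [name for name, pattern in patterns.items() if qualifies(scores, pattern)]


# Hypothetical client scores.
client = {"G": 104, "N": 112, "S": 108, "M": 99, "F": 93}
print(qualifying_families(client))
```

In this arrangement a high score on one aptitude does not compensate for a score below the minimum on another, which is the chief practical difference from a comparison with mean profiles.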

The Differential Aptitude Tests of the Psychological Corporation, also 
described in Chapter 15, are another attempt to improve and extend 
the occupational ability pattern or psychometric profile method, although 
to date it has been applied only to school populations. The American 
Institute for Research has in preparation a comparable battery, based on 
the wartime studies of aviation psychologists, and other such batteries 
are also being planned (320). 

The usefulness of this method of appraising vocational promise de¬ 
pends, as might be expected in the case of an empirical method, upon 
the accumulation of objective evidence. It has been seen that only a bare 
minimum of such data are now on hand, enough to reveal the promise 
and the defects of the technique, to provide a concrete basis for the mak¬ 
ing of some decisions, and to make somewhat less intuitive some of the 
clinical judgments which have to be made when objective data are
lacking. It would be pertinent to ask whether there may not be real danger
of a too mechanical application of this psychometric method once more 
occupational ability patterns are available, for it is certain that no test 
battery in the foreseeable future will be able to measure every trait which
has a bearing on success and satisfaction, especially when it is remembered 
that some of the factors which determine success and failure are not per¬ 
sonal or psychological, but rather environmental or economic and social. 
But discussion of this question is postponed until the end of the next 
section. 

The Psychometric Index Method 

The combining of test scores in order to provide a single score or index
of vocational promise has long been practiced, both in the arbitrary
weighting of scores on the basis of a priori judgments of their relative
importance in a job, and in the statistical weighting of test scores on the
basis of their respective correlations with the criterion. This has been a
selection technique, however, rather than a method of appraising an
individual for counseling, largely because data were lacking for the
statistical weighting of tests for counseling and because counselors were
properly reluctant to give the appearance of objectivity to their judgments
by arbitrarily weighting the scores and combining them. Perhaps two
exceptions to these statements may now be made. 

In the Kuder Preference Record (pp. 445 ff.) a series of scores (those for
the nine types of interests) can be weighted on the basis of their
relationship to membership in an occupation, and combined to show how closely
an individual’s interests resemble those of members of that occupation. 
This is a limited application of the technique, both because the scores 
involved represent only interests, and because such occupational indices, 
as Kuder calls them, have so far been developed for only two occupations. 

The stanines of the Air Force’s Aviation Psychology Program (214) may
be a second exception. Although these were developed for selection
purposes rather than for counseling, the fact that they are available for three
different flying jobs and are all obtained from the same basic test battery 
means that they could also be used in counseling concerning the choice 
of any one of those three specialities. This too is a limited application of 
the technique, but it illustrates its possibilities. 

The fundamental argument for the use of the psychometric index is 
that it does away with the subjectivity of the profile: instead of leaving 
to the counselor the making of an overall judgment of the similarity of
the client’s psychological characteristics to those of members of the
occupation in question, this “judgment” is made by an empirically based
statistical process which is more precise than the subjective judgment of
the counselor. Each aptitude and trait is weighted on the basis of its
occupational significance, and the similarity of the individual to others
who have succeeded in that occupation is expressed by the final score or
index. In the Air Force, for example, a cadet with a stanine of 9 is known
to have aptitudes, interests, and temperament very much like those of
other cadets, 84 percent of whom succeeded in learning how to fly, while
another cadet with a stanine of 1 is clearly shown to have characteristics
like those of cadets, 81 percent of whom failed to learn to fly in the allotted
time (214:115).
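
The arithmetic of such an index can be illustrated with a minimal sketch in Python. It assumes that each test score has already been expressed as a standard (z) score, that the composite is a simple weighted sum, and that the composite is converted to a stanine, that is, a standard score on a nine-point scale with a mean of 5 and a standard deviation of 2. The test names, the weights, and the norm-group figures are hypothetical; they are not the weights used in the Aviation Psychology Program.

```python
import math

# Hypothetical weights for combining z-scores into one composite; in practice
# such weights would be derived from each test's relation to the criterion.
WEIGHTS = {"instrument_comprehension": 0.4, "coordination": 0.35, "mechanical": 0.25}


def composite(z_scores, weights=WEIGHTS):
    """Weighted sum of z-scores."""
    return sum(weights[test] * z_scores[test] for test in weights)


def to_stanine(value, norm_mean, norm_sd):
    """Convert a composite to a stanine (mean 5, SD 2, limited to 1 through 9)."""
    z = (value - norm_mean) / norm_sd
    # floor(2z + 5.5) rounds 2z + 5 to the nearest whole number.
    return max(1, min(9, math.floor(2 * z + 5.5)))


# Hypothetical cadet and hypothetical norm-group mean and SD of the composite.
cadet = {"instrument_comprehension": 1.0, "coordination": 0.5, "mechanical": -0.2}
score = composite(cadet)
print(to_stanine(score, norm_mean=0.0, norm_sd=0.6))
```

The meaning of a given stanine, such as the proportion of cadets at that level who succeed in training, cannot be computed from the scores themselves; it has to be established by follow-up, as in the figures cited above.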

As was mentioned in the discussion of the clinical method, this
procedure has sometimes been criticized as too mechanical, as failing to take
into account the multiplicity of personal and social factors which affect
success and satisfaction. Probably no one would contend that it does take
all of these into account: its proponents would argue only that what it
does consider is taken into account in the most accurate manner possible.
The adequacy of that kind of appraisal is a matter, not for discussion
(except for the establishment of hypotheses), but for experimentation.

Two kinds of evidence are available, one a comparison of the
effectiveness of various clinically appraised test data with that of mechanically
computed Air Force stanines for the prediction of success in training, the
other a comparison of the effectiveness of clinically evaluated and
mechanically applied stanines for the same purpose. Both of these are
experiments in a selection rather than a counseling situation; unfortunately
the lack of psychometric indices for use in counseling has precluded the
possibility of making such experiments in counseling programs.

In the Clinical Techniques Project of the Army Air Forces, aviation
psychologists experimented with a number of clinical evaluation tests
for the selection and classification of pilots (316: Ch. 24; 616). These tests
included ratings of prospects of success based on observations made while
the cadet responded to a confusing sequence and combination of signals
in a miniature cockpit, while he worked with three others to assemble the
parts of three sets of Wiggly Blocks, and while he sat in a waiting room
surrounded by odds and ends of wrecked airplanes, bombs, and similar
objects; they also included the Group Rorschach, which was scored in
the usual manner and also evaluated impressionistically to yield a rating
of promise as a flier. As has already been mentioned, none of the
techniques had any substantial validity, and those which showed some slight
promise in the first validation were proved invalid in the cross-validation.
At the same time, the objectively derived stanines had their usual sub¬ 
stantial correlations with success in pilot training. It should be pointed 
out that this was a very limited evaluation of the clinical method, for 
each clinical evaluation was based on only one source of data, however 
global in approach the test was. Although it had been planned to make 
evaluations on the basis of a clinical synthesis of all data for each cadet, 
this part of the plan broke down because of the sheer bulk of the data
to be handled and the impossibility of assigning the required number of
psychologists to the project over such a long period of time. 

The Surgeon's Classification Board provided an opportunity for a
more comprehensive clinical evaluation of cadets being considered for
flying training during several months in which it was experimented with
during World War II (described in a military report by W. M. Lepley
and H. D. Hadley). The board consisted of a flight surgeon and an
aviation psychologist, who interviewed each cadet with stanines below
the required levels for all three air crew assignments (at that time 3 for
pilot and bombardier, 5 for navigator). The interviews lasted
approximately eight minutes each, ranging in length from five to twenty minutes.
A total of 152.1 cadets were interviewed during the six months of the
board’s existence at this one classification center, and 285 were sent to
pilot training because the board’s review of the test scores and interview
data led it to believe that the cadet would make a good pilot. Follow-up
data were obtained for 259 of these cadets, who were test-matched with
1 |f> cadets sent to training at a somewhat earlier date when standards 
were lower and without having been passed on by a board. Various
analyses were made by class and time of training; in the most legitimate
comparison, 68.9 percent of the cases passed by the board failed in training,
whereas 73 percent of those with similar stanines who went automatically
to training failed. The critical ratio was 0.50, showing that cadets who
were clinically evaluated by a board of experts were no more likely to
succeed than others who had the same stanine or psychometric index but
were not clinically evaluated. Despite certain defects in the design of this 
real life experiment (e.g., elimination rates were not quite the same when 
the two groups were in training, being slightly lower for and therefore 
favoring the board cases), Lepley and Hadley seem to have definitely put
the burden of proof upon those who claim that the clinical method is
superior to a comprehensive battery of objectively validated and
summated tests. At present, one can only conclude that the rather superficial
but costly clinical methods which have been evaluated have been proved 
no more effective than the less time-consuming objective methods. 
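
For readers unfamiliar with the statistic, a critical ratio of the kind reported above is the difference between two percentages divided by the standard error of that difference; values much below 2 are conventionally regarded as not significant. The following minimal sketch in Python shows the computation. The failure rates and group sizes in it are hypothetical placeholders rather than the figures of the Lepley and Hadley report, so it is not expected to reproduce the reported value.

```python
import math


def critical_ratio(p1, n1, p2, n2):
    """Difference between two proportions divided by its standard error."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return abs(p1 - p2) / se


# Illustrative failure rates and group sizes (hypothetical, not the report's data).
cr = critical_ratio(0.69, 100, 0.73, 100)
print(round(cr, 2))
```

With groups of this size a difference of four percentage points yields a critical ratio well below 2, which is the sense in which such a difference is said to show no advantage for either group.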

A Balanced Approach in Counseling 

The preceding sections have brought out the facts that use of the 
clinical method is often necessitated by the lack of data basic to the use 
of psychometric methods, and that the fully developed psychometric 
index method is not easily improved by adding clinical evaluation to it. 
It has also been made clear that both methods depend for their success 
on the use of a variety of relevant and well-understood tests. In view of 
the scarcity of psychometric occupational indices, the clinical method, 
made as objective as possible by occupational norms, must generally 
suffice as the technique of individual appraisal for vocational counseling. 

In closing this discussion, a word of caution needs to be put on record 
concerning the mechanical use of test results. The presumed superiority 
of completely validated and objectively summated test data over clinical 
interpretations does not mean that test results should be used
mechanically, if “mechanically” is taken to mean applied indiscriminately and
regardless of the background of the person taking the tests, his health 
and morale at the time of testing, and the conditions of testing. Clinical 
interpretation in this sense is always necessary, and in counseling it 
should be easier than in a large-scale selection program. One illustration 
will perhaps suffice to make the point. It will be remembered from the 
discussion of test administration that Meltzer (524) reported a correlation 
between manual dexterity and output in an industrial job which changed 
from —.27 to .30 with a change in supervision. The attitudes of the 
persons taking the tests and producing the output are important. When 
such factors are involved the clinical insight of the test user is crucial. 
He cannot know whether or not such factors are present unless he has 
insight and is alert to use it. 
USING TEST RESULTS
IN COUNSELING 

THE interpretation of the results of psychological tests, whether by the 
counselor for his own diagnostic purposes as discussed in the preceding
chapter, or for the counseling of clients as considered in this chapter, 
has been strangely neglected by most authors of books or articles on the 
use of tests in guidance. In the texts of the mid-thirties there was some 
mention of problems of technique, but it is only with the focusing of 
attention on interview techniques which resulted from the work of the 
nondirective school that it has been written about in detail. In view of 
the relative recency of some of these developments and the controversy 
which still surrounds them, it seems wise to describe the techniques of 
transmitting test results to clients as they have been reported in the 
literature before attempting to suggest a method which combines the 
strengths of several. 

In a treatise such as this it is difficult to observe the distinction
between test interpretation and counseling; indeed, it could be maintained
that there is none, for test interpretation is one technique of counseling.
But it is only one technique, and a very limited one, despite the fact that
some psychologists who are more skilled in psychometrics than in
counseling have acted as though it were the principal method of counseling.
As a technique, it can legitimately be singled out for discussion by itself.
It must be remembered, however, that it can be fully understood only
within the framework of counseling in general. Chapter I has been
devoted to a discussion of counseling; in this chapter, the focus is
therefore as narrow as possible on test interpretation. With this caution in
mind, we may proceed to survey methods of test interpretation.

Directive Test Interpretation 

One of the first specific discussions of interpreting test results to clients
appeared in 1937 in Williamson and Darley’s Student Personnel Work 
(931). After describing the types of material included in the synthesis
of test and other personal data, they wrote: “It is the job of the counselor
to integrate this material, to interpret the present abilities and
achievements of the case in terms of his background, and to draw conclusions
from these interpretations. The final act of counseling the case is not
performed by instructing the student to train for this or that particular
profession, but by presenting to the student his possibilities in certain
lines of endeavor: alternative goals, with the evidence for and against a
choice, help to clarify the student’s thinking and provide needed data for
a tentative decision. He is urged to try, at least, that course which seems
to suit his abilities and interests most favorably; the tentative nature of
the try-out and the necessity for further interviews, before a final
decision, are emphasized” (931:166). Again: “The recommendations upon
which prognoses are based must be in terms of alternatives so that the
student may make his own choice. It is at this point in the case work that
the counselor translates his two basic principles, about prediction for
success in training and prediction based upon the characteristics of goal
groups or occupational groups, into terms that the student can
understand in relation to his own problems” (italics in the original) (931:175). 1
Williamson and Darley gave no more space to test interpretation in this
book, but this brief discussion makes it clear that they viewed the process 
as one explaining logically and in everyday language the significance of 
tests and their vocational implications to the client. 

1 By permission from Student Personnel Work, by Williamson, E. G., and Darley,
J. G. Copyrighted 1937, McGraw-Hill Book Co.

These points were elaborated upon somewhat in later books by the
same authors. In How to Counsel Students (928) Williamson wrote:
“The counselor must begin his advising at the point of the student’s
understanding, i.e., he must begin marshaling, orally, the evidence for
and against the student’s claimed educational or vocational choice and
social or emotional habits, practices, and attitudes. The counselor uses
the student’s own point of view, attitudes, and goals as a point of reference
or departure. He then lists those phases of the diagnosis which are
favorable to that point of reference and those which are unfavorable.
Then he balances them, or sums up the evidence for and against, and
explains why he advises the student to shift goals, to change social habits,
or to retain the present ones. The counselor always tells what a relevant
set of facts means, i.e., their implications for the student’s adjustment, in
other words he always explains why he advises the student to do this or
that; and he does the explaining as he orally summarizes the evidence.
If in this way the student’s confidence in the counselor’s integrity, friendliness,
and competence has been secured, the student should be ready to
discuss the evidence and to work out cooperatively a plan of action”
(928:135). 2 Although there is little mention of tests in the preceding material,
it is clear that in a University Testing Bureau such as Williamson
directed much of the evidence presented to the student would be in the
form of test data. After a survey of other methods of counseling, and
with only a passing reference to passive or indirect methods (as the nondirective
were then called), Williamson took up the explanatory method
in more detail: “In using this method the counselor gives more time to
explaining the significance of diagnostic data and to pointing out possible
situations in which the student’s potentialities will prove useful.
This is by all odds the most complete and satisfactory method of counseling
[italics original], but it may require many interviews. With regard
to vocational problems the counselor explains the implications of the
diagnosis (of test and personal data) and the probable outcome of each
choice considered by the student. He phrases his explanation in this
manner.

“ ‘As far as I can tell from this evidence of aptitude, your chances of
getting into medical school are poor; but your possibilities in business
seem to be much more promising. These are the reasons for my conclusions:
You have done consistently failing work in zoology and chemistry.
You do not have the pattern of interests characteristic of successful
doctors which probably indicates that you would not find the practice of
medicine congenial. On the other hand you do have an excellent grasp
of mathematics, good general ability, and the interests of an accountant.
These facts seem to me to argue for your selection of accountancy as an
occupation. Suppose you think about these facts and my suggestion, talk
with . . . , see . . . , and return next Tuesday at 10 o’clock to tell me
what conclusion you have reached. I shall not attempt to influence you
because I want you to choose an occupation congenial to you. But I do
urge that you weigh the evidence pro and con for your choice and for
the one I suggest’ ” (928:139-140).

2 By permission from How to Counsel Students, by Williamson, E. G. Copyrighted
1939, McGraw-Hill Book Co.

In Testing and Counseling in the High School Guidance Program
(190: Ch. 7) Darley wrote in the same vein. The counseling interview,
he stated, may be thought of as an unrehearsed play in which the counselor
“carries” the action; a special learning situation for the student;
a cathartic experience for a student suffering from great emotional pressure;
or a sales situation. In all but the cathartic type of interview Darley
conceived of the counselor as taking the lead, for in the play he must
“organize the conversation” and “summarize the action”; in the learning
situation he “explains the assembled test material and non-test data to
the student, and then follows this by a discussion of the material”
(190:169); and in the sales situation he “attempts to sell the student certain
ideas about himself, certain plans of action, or certain desirable changes
in attitudes. Persuasion and logic will facilitate and hasten the sale of
such ideas by a counselor” (190:169). Darley continued (190:179): “Many
books on guidance insist that the counselor must not tell the student
what to do. While such a generalization seems unsound, since it emasculates
most of the purpose of data collecting and since it would be of no
assistance to a student who needs help in making a decision, it is still
true that the student who chooses one from among several suggested plans
of action will feel a more active participation in planning with the
counselor.” 3

Experience and the contributions of others may have led both Williamson
and Darley to modify their viewpoints since the above texts were
written, for much progress has recently been made in the clarification
of counseling methods, but these writings have influenced and are influencing
many users of tests and many counselors. For this reason it is
necessary to present them in some detail. Perhaps the best way to see
the limitations of this method is to present the antithetical point of view,
and then to attempt a synthesis.

3 By permission from Testing and Counseling in the High School Guidance Program,
by Darley, J. G. Copyrighted 1943, Science Research Associates.

Nondirective Test Interpretation

Most active and severest critics of the directive approach of Williamson,
Darley, and many other vocational psychologists and counselors
have been the nondirective counselors, led by Rogers (640,641). His most
detailed discussion of the use of tests in nondirective counseling (640)
first points out that tests do not stand up well as a “client-centered”
counseling technique, because in suggesting tests the counselor implies
that he knows what to do about the client’s problem, in administering
them routinely or early in a contact he proclaims that he can find out all
about the client and tell him what to do, and in interpreting them he
poses as an expert who knows all the answers and will impart them to
the client. “By every criterion, then, psychometric tests which are initiated
by the counselor are a hindrance to a counseling process whose
purpose is to release growth forces. They tend to increase defensiveness
on the part of the client, to lessen acceptance of self, to decrease his sense
of responsibility, to create an attitude of dependence upon the expert”
(italics are mine) (640:141). As one might expect from the italicized
clause, however, Rogers went on to point out that there are stages in
counseling at which clients are emotionally ready to study their abilities
and interests and to compare them with those of others as a part of the
formulation of objectives and the making of plans. Rogers believes that
this does not occur frequently in practice, and that it is not the factual
test results which are important, but rather the attitudes of the client
toward them. He therefore sees little place for tests in practice while
admitting that there is one in theory.

Rogers’ views may have been conditioned unduly by selective experience.
His theories were formulated while working in a child guidance
clinic. After that his work in a university counseling center undoubtedly
brought him cases in which emotional problems were much more common
and more serious than they are in cases going to the vocational
counseling center of the same university. It is a commonplace that people
are referred to and gravitate toward persons who are interested in their
types of problems: psychoanalysts encounter sex problems, ministers
religious problems, attitude (nondirective) counselors attitudinal problems.
It is significant that those of Rogers’ students who have worked in
centers which specialized more in vocational and educational counseling
have found their viewpoints modified by that experience. It is from them
that some very helpful formulations of the role of tests in counseling and
of the methods of interpreting test results to clients have come. Contributions
from the Bixlers, Combs, and Covner are cited below.

The use of nondirective techniques in vocational counseling was considered
by Covner (173) in a review of his experience in a vocational
counseling center. Concerning tests he wrote: “A second major locus of
fruitful application of the nondirective approach was the area of preparation
for testing and interpretation of test results. . . . Test interpretation
called for all the skill the counselor could muster. . . . As an introduction
to interpretation it was frequently found helpful to sound out a
client on his reactions to the tests. His mode of response served as a guide
and warning to the counselor as to what sort of session test interpretation
would be. For example, when a client who did very poorly on certain tests
reported that he ‘knocked them for a loop,’ the counselor took notice
to proceed with caution. The same approach on a number of occasions
showed that clients were able to do a remarkably accurate job of interpreting
their relative strengths and weaknesses, and to reveal considerable
understanding of themselves” (173:71-72). Covner goes on to point out
that rejection of the counselor’s interpretations often seemed to be the
result of failure to give the client sufficient time to react, and that exploration
of client reactions was facilitated by reflecting feelings, as in
the statement, “The results are rather disappointing to you.” To the
experienced vocational counselor who has not been unduly influenced
by highly directive writings such as those of Williamson and Darley,
these insights into test interpretation do not seem very surprising: such
nondirective techniques have been the stock-in-trade of good vocational
counselors since the origin of modern vocational guidance, but because
of greater interest in occupational information and in the counselor’s
use of tests, they simply have not been written up.

Types of problems best handled by directive and by nondirective
techniques have been analyzed by Combs (166) who worked in a university
counseling center in which educational and vocational problems
outnumbered emotional adjustment cases by three to one (322). One
type of case best handled nondirectively was that in which the level of
aspiration is definitely higher than demonstrated ability or in which there
is a wide discrepancy between expressed and measured interests. Combs
points out that such discrepancies are warning signals, and that the
emotion likely to be aroused by being brought face to face with them
is best handled nondirectively. He does not elaborate on method in this
connection, but it consists primarily of accepting the client’s feelings and
of reflecting them in such a way as to make it possible for him to discharge
the emotion, to accept himself, and then to discuss the situation
and its implications objectively. This subject has been most adequately
covered by the Bixlers, whose contribution is discussed below.

The Bixlers served as counselors in the Student Counseling Bureau
(formerly the University Testing Service) of the University of Minnesota,
in which the test-oriented philosophy of Williamson and Darley tended
to prevail; they therefore made a point of studying the use of nondirective
techniques in vocational test interpretation (97,98) which, strangely
enough in nondirective counselors, they seem to consider synonymous
with vocational counseling. They begin by pointing out that there are
two aspects to the problems of test interpretation: 1) presenting test
results to the client in such a way that they are understood, and, 2) dealing
with the client in such a way as to facilitate his use of the information.
Although they do not so state, it seems to the writer that Williamson
and Darley focused on the former and did it rather well, except that,
as implied in Covner’s report, they may not have allowed the client to
react to the presented facts enough to guarantee an understanding of
them. They appear to have failed to deal with the client in such a way
as to ensure his being able to use the facts, depending entirely on his
being sufficiently well adjusted emotionally to assimilate a mass of
personal and therefore emotionally toned information. As Rogers
pointed out, this is sometimes the case, perhaps more often than Rogers
recognized, but it is certainly not always so. As the Bixlers put it: “The
grading of examinations at the end of the quarter verifies the ineffectiveness
of books and lectures in giving information to students. Vocational
test interpretation is much more personalized and there is greater opportunity
and reason for the student to distort or disregard information
given to him” (98:147). How many innocent counselors have not been
shocked when clients or former clients reported that “You told me my
tests showed that I would make a good personnel manager . . .” merely
on the basis of one Strong’s Blank score? The rules evolved by the Bixlers
for interpreting vocational tests to clients are given below.

1. Give the client simple statistical predictions based upon test data.
Examples: “Eighty out of 100 students with scores like yours on this test
succeed in agriculture.” “You have more of this type of aptitude than [illegible]
out of 100 successful accountants.” This can of course be elaborated
upon.

2. Allow the client to evaluate the prediction as it applies to himself.
After merely stating the facts the counselor pauses, perhaps longer than
he feels he should, in order to let the client react to the facts.

3. Remain neutral towards test data and the client's reactions. The 
counselor expresses no opinions, gives no advice, but in a warm and 
respectful manner listens to what the client has to say. This is called 
acceptance: it is not the same as agreement. 

4. Facilitate the client's self-evaluation and subsequent decisions. The
counselor recognizes and reflects the feelings and attitudes of the client.
Example: “You expected this, but it’s hard to take.” This makes it easier
for the client to explore his feelings further, to release any related tensions,
and to view the data and their implications more objectively.

5. Avoid persuasive methods. The counselor need provide no artificial
motivation: the test data and the exploration and release of related feelings
should do that. If they do not, neither will the exhortations or
cajolements of the counselor.
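
The actuarial statements called for in the first rule rest on simple counting, which can be sketched as follows. This is a minimal illustration only, not the Bixlers' procedure: the follow-up records, the score band, and the function names (successes_per_hundred, percent_below, actuarial_statement) are all hypothetical, introduced here to show how an "out of 100" figure might be derived from past cases.

```python
from typing import List, Optional, Tuple

# Hypothetical follow-up records: (test score, succeeded in the curriculum?).
Record = Tuple[float, bool]


def successes_per_hundred(records: List[Record], low: float, high: float) -> Optional[int]:
    """Successes per 100 among past students scoring in [low, high]."""
    band = [succeeded for score, succeeded in records if low <= score <= high]
    if not band:
        return None  # no comparable cases, so no prediction is offered
    return round(100 * sum(band) / len(band))


def percent_below(score: float, norm_scores: List[float]) -> int:
    """Number out of 100 in the norm group who score below the client."""
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    return round(100 * below / len(norm_scores))


def actuarial_statement(rate: Optional[int], field: str) -> str:
    """Phrase the prediction as a bare fact, with no advice attached."""
    if rate is None:
        return f"There are too few comparable cases to make a prediction for {field}."
    return f"{rate} out of 100 students with scores like yours on this test succeed in {field}."


# Made-up data for illustration: ten students in the 50-60 band, eight succeeded.
records = [(55, True)] * 8 + [(55, False)] * 2 + [(30, False)] * 5
rate = successes_per_hundred(records, 50, 60)
print(actuarial_statement(rate, "agriculture"))
```

The wording is deliberately left free of recommendation, in keeping with the second and third rules: the figure is stated, and the evaluation is left to the client.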

Some sample excerpts from cases are given by the Bixlers (98:151-152): 
one of these is reproduced and commented on below, in order further to 
illustrate the technique. 

C1. There are studies which demonstrate that students’ ranks in high school,
along with the way in which they compare with other entering students in
mathematics, are the best indication of how well they will succeed in engineering.
Sixty out of one hundred students with scores like yours succeed in engineering.
About eighty out of one hundred succeed in the social sciences (names
several). The difference is due to the fact that study shows the college aptitude
test to be important in social sciences, along with high-school work, instead of
mathematics.

S1. But I want to go into engineering. I think I’d be happier there. Isn’t
that important?

C2. You are disappointed with the way the test came out, but you wonder if 
your liking engineering better isn’t pretty important? 

S2. Yes, but the tests say I would do better in sociology or something like
that. (Disgusted.)

C3. That disappoints you, because it’s the sort of thing you don’t like. 

S3. Yes, I took an interest test, didn’t I? (C nods.) What about it?

C4. You wonder if it doesn’t agree with the way you feel. The test shows that
most people with your interests enjoy engineering and are not likely to enjoy 
social sciences— 

S4. (Interrupts.) But the chances are against me in engineering, aren’t they?

C5. It seems pretty hopeless to be interested in engineering under these conditions,
and yet you’re not quite sure.

S5. No, that’s right. I wonder if I might not do better in the thing I like—
Maybe my chances are best in engineering anyway. I’ve been told how tough
college is, and I’ve been afraid of it. The tests are encouraging. There isn’t much
difference after all—being scared makes me overdo the difference. (He decided
to go into engineering and seemed at ease with his decision.)

Several good features of this interview illustrate points made by the
authors, and are worth pointing out, together with some defects in the
use of the technique. The first statement of the counselor (C1) is a factual
statement of an actuarial sort, without explicit personal applications or
evaluations. In these respects it is good. It fails, however, to achieve
simplicity and clarity. What kind of score or scores are referred to in
the “sixty out of one hundred” sentence: high school rank, mathematics
grades, or mathematics achievement test scores? If the last is meant, is
that the score which predicts success in eighty out of one hundred students
in the social sciences? The last sentence in the paragraph suggests
that a scholastic aptitude test is the predictor for social sciences, a mathematics
test that for engineering. This may have been made clear in unreported
passages preceding this one, but as it stands the paragraph has
to be carefully analyzed in order to be understood. It need be no longer
to be clear by itself.

The client’s first response (S1) illustrates the value of the method in
obtaining free expressions of the client’s feelings on the matter, and the
counselor replies (C2) by recognizing and reflecting the feeling. This
causes the client to bring up the counter argument himself (S2), putting
him rather than the counselor in the position of the weigher of evidence;
the client is taking responsibility for working out a solution. The counselor
helps him continue the thought process by again reflecting feeling (C3).

The client pursues the matter further and it occurs to him that
another test he took might throw some light on the matter (S3). This
natural introduction of test results and discussion of them as one more
bit of evidence is one of the very real strengths of this technique: it
should be noted that the client is putting the test data to use with the
help of the counselor, rather than the counselor telling him what their
implications are. A follow-up of such cases would probably show much
less distortion of test data and of counseling by the client, for the client
has done his own thinking and reached his own conclusions in the
presence and with the help of the counselor, rather than after he left the
counselor’s office.

The counselor then reports the relevant test results (C4), reflecting
feeling in the process in order to help the client clarify his thinking. The
client continues to keep control of the diagnostic process, interrupting
(S4) to make a tentative interpretation for the counselor to check. The
counselor reflects feeling (C5), rather than repeating statistics. The result
is a summary weighing of evidence by the client, in which he reaches
a conclusion based on an understanding and acceptance of his own
limitations and an awareness of the assets which he may draw on to help
carry out his plans.
A Synthesis of Suggestions for Interpreting Test Results to Clients

The preceding sections have made it clear that the nondirectivists who
have worked in vocational counseling have made a significant contribution
to the literature on test interpretation. Their insistence on analyzing
counseling experiences and techniques has led them to formulate principles
and to describe the use of methods of test interpretation which
have long been in use by many counselors. In verbalizing what is done
they have crystallized thinking on the subject and thereby helped to
improve practice. As they approach the problem from the point of view
of a systematic school of thought they have made some unique contributions
in pointing out the implications of various interpretive techniques
for counseling; they also run the risk, as shown in Rogers’ paper on the
subject, of failing to see other implications and other possibilities because
of theoretically produced blindspots. There are values also in some
of the more directive procedures, and occasions when they are more
effective. For these reasons the writer prefers a more eclectic approach. If
the suggestions outlined in the paragraphs which follow appear more
nondirective than directive, that is because the writer’s philosophy
and approach, like those of Dewey, Kilpatrick, Kitson, Brewer, Taft,
Allen, Roethlisberger, Cantor, and Rogers, are client-centered and nondirective,
even though not Client-Centered and Nondirective (that is, not
those of a “school” of counseling). The writer’s philosophy is also pragmatic,
in that he is willing to use whatever works, and does not feel
compelled to use only the techniques which are compatible with a system,
valuable though systems are as means of making one conscious of the
implications of the procedures used.

Structuring the Counseling Relationship. It may seem odd to begin
a discussion of methods of test interpretation with a consideration of
the structuring of the counseling relationship, but experience has repeatedly
shown that the client’s attitude toward test results is an important
factor in his first contacts with the counselor, and that what happens
in these first contacts makes it easy or difficult for the counselor to use
the test results constructively in counseling. As Bordin and Bixler (113),
Covner (173), and others have put it, most new clients feel that their
problems will be solved by the counselor and that tests will play a major
part in the process, and many are confused as to what vocational counseling
is like. One prerequisite to good test interpretation, then, is the
establishing of an appropriate mental set in the client; this is generally
referred to as structuring the relationship. The techniques are partly
verbal, partly nonverbal.

Verbal structuring may be done by asking the client what kind of help
he wants the counselor to give him and, if (as often happens) the reply
is, “Give me some tests to tell me what I will succeed in,” by replying,
“You feel that tests will solve your problem for you.” Most clients react
negatively to this type of bald but accepting statement by another of
what they have actually been thinking; it brings to the surface the realization
that they must assume more of the responsibility for their actions,
and that tests are not likely to provide any such clear-cut answers. For
the client to formulate and express these ideas himself is much more
effective than for the counselor to do so for him: the former constitutes
the achievement of insight, while the latter may be no more than indoctrination.
Verbal structuring is also accomplished by an explanation
of the counseling procedure, to make it clear that it consists of two
persons, one of whom is trained in counseling and in occupational information,
discussing the other person’s aspirations, status, abilities,
interests, and plans, surveying relevant facts, and considering their implications.
It brings out the fact that testing is one way of getting some
types of data, that there are other types of data and other ways of obtaining
them, and that discussion is the crucial process.

Nonverbal structuring is based on the old adage that actions speak
louder than words. If the counselor creates a permissive situation, acts
as though he were interested in the client, and “accepts” the client’s
expressions of feeling, the client will generally sense that discussion is
the essence of the vocational counseling, and that his own participation
is the essence of the discussion; he will usually welcome the opportunity
to make a genuine exploration of his vocational aspirations and status
and will assume responsibility for working actively with the counselor
in this enterprise. Once this type of relationship is established, the
counselor need have no fear that the use of tests will unintentionally
result in the imposition of a vocational prescription on the client.

Test Administration and Interpretation. In the chapter on test
administration attention was devoted exclusively to the problems and
techniques of administering tests to individuals and to groups. But what
has been said about interpreting test results to clients has made it clear
that it has broader implications, for the way in which testing is done has
an important effect on the client’s expectations of tests. As Rogers (640)
pointed out, the routine administration of tests or the giving of tests
early in the counseling process, i.e., in an unstructured relationship,
implies both that the counselor knows what to do about the problem
and that he can find out what he and the client need to know by means
of tests. Such test administration is, in fact, a nonverbal or behavioral
structuring of the situation. The antidote is not necessarily, as Rogers
implies, to refrain from testing early in the relationship; it may consist
of so structuring the relationship by discussion and behavior as to make
it possible to test without creating this undesirable mental set. As clients
often come with it already partly established, to do so requires special
attention to the problem and a degree of skill, but it can be done. The
essential factor is that testing be done by mutual agreement for jointly
established purposes. Ways of accomplishing this are considered in the
paragraphs which follow.

Routine test administration is often administratively desirable. In
schools and colleges it is most economical to give tests to entering classes,
in order to have the data for sectioning, screening, diagnostic, and counseling
purposes when and as needed. In guidance centers it simplifies
scheduling, cuts down expenses, and is a safeguard against failure to
obtain and consider basic objective data such as intelligence and interest
test results. The question then is, can these values be preserved, without
creating the mental set criticized by the nondirectivists and most other
counselors? The writer is not familiar with any systematic experimentation
with this problem, but experience and observation suggest that it
may be done by two methods, one applicable to academic testing programs
and one to guidance centers.

In school or college testing programs a large number of the students
taking tests do so as a routine matter, because they are asked to do so
rather than because they want to take them. Others are more immediately
interested because they have problems of curricular or vocational
choice to the solution of which they believe tests will contribute. Both
of these groups can be helped by a brief explanation of the fact that the
testing program is part of the institution’s method of obtaining information
which may be useful in problems of choice and adjustment, and that
the test-obtained data are just one part of the information secured over
the years, and added to the student’s record. It is stressed that the other
data, such as the previous school record, grades, extracurricular activities,
part-time and summer work experience, and the student’s own feelings
on these matters of vocational choice and educational adjustment, are
of central importance, test data being just one kind of helpful information.
The procedure is like a routine medical examination: it may not
turn up anything of special significance, but on the other hand some
items may help to give a better understanding of a situation. This type
of explanation will not uproot any strongly imbedded ideas that tests
give all the answers, but it will help prevent their taking root and may
help pave the way for more individualized structuring when a student
comes for counseling.

In guidance renters to which individuals come for help with problems 
of vocational and educational adjustment test administration is always 
preceded by some sort of intake or registration interview. This can be 
and too often is handled as nothing more than a registration procedure, 
in which the basic data concerning the client are obtained, the presented 
problem is ascertained, the type of test batter)' to be given is determined, 
and an appointment is made for testing. But it can also be made an 
occasion in which the client finds an opportunity to discuss his problem 
in a pei missive atmosphere and to develop an often-needed orientation 
both to his situation and to the kind of help which can be given by a 
guidance center. If the intake or preliminary interviewer (whether or not 
he is the final counselor) is nondirective at first and permits the client 
freely to explore his problem he will generally get better insight into its 
nature and into the kind of testing and counseling which is needed than 
if he proceeds at once systematically to take a history; he will establish a 
relationship in which the client is an active participant; and with this 
as a foundation he can help the client to understand that the inform,ition 
asked for in taking the history and the data sought in administering tests 
are simply part of the background material which the counselor uses in 
getting an orientation to him as a person. It can also be stated that some 
of the background data, such as the type of education received, grades 
earned, jobs held, and scores made on tests, may at various points in the 
counseling be facts to which the client and counselor will want to refer 
and which they may want to discuss. Testing may then occupy the same 
position in the counseling sequence as it now does in most guidance 
centers, and it may stand out as such administratively, but the client no 
longer views it as the procedure which gives the counselor and himself 
the answers they seek; instead, he sees it as simply one more data-collect- 
ing device, and he begins to understand that data collecting is only one 
small part of the counseling procedure. It then devolves upon the 
counselor to establish a permissive relationship with the client, carrying on 
that begun in the early part of the intake interview. 
This is done by keeping the focus on and the responsibility in the 
client. The counselor may begin the first interview, for example, with a 
statement that “Mr. Doe, the first interviewer, has told me something 
about you, and of course I have looked over the records, but I think it 
would be most helpful if you would tell me, in your own words, just what 
you have been thinking about and with what you think we might help 
you.” The new relationship thus begins as one in which the client is 
active rather than passive, in which the counselor can accept and reflect 
feelings, and in which the client uses the superior knowledge and insights 
of the counselor to develop his own understandings. 

After routine testing and the re-establishment of a permissive 
relationship, test interpretation may be done at various points during the 
interviews. When the client wants to evaluate himself in comparison with 
other students or occupational groups, and asks how he stood on some 
test, the counselor reports the results in actuarial, nonevaluative terms in 
the manner previously described, permits the client to react to the facts, 
reflects his feelings, and facilitates further self-evaluation as the client 
continues to explore the significance of the facts for himself. This 
reporting of test results is often scattered throughout a series of interviews, the 
data being introduced only as they are relevant and requested by the 
client. More often in practice, but perhaps less desirably, the reporting of 
test results is done in one session, in which the counselor gives the client 
a profile of his test results to help him visualize them while he explains 
their actuarial significance. The results should be expressed in percentiles 
(without I. Q.’s!) so that relative standing will be easily understood by 
the client, and the nature of each group with which a comparison is 
made must be explained as briefly recorded on the test profile. Having 
the data in front of him permits the client to take in both his standing 
on each test and the nature of the comparison. This process is unfamiliar 
to the client and therefore requires much more of a mental adjustment 
than the counselor, used to test reports, generally realizes. It makes it 
possible for the counselor either to complete his explanation of the whole 
battery, allowing the client to come back to and discuss each score, or to 
stop after each test score has been briefly explained and allow for as 
thorough an exploration of that datum as the client wants to undertake. 
The writer is not ready to recommend one procedure rather than the 
other, but suspects that whichever method the client wants to use is best. 
If so, the effective counselor will pause long enough and be permissive 
enough after each brief explanation for the client to be able to take the 
initiative any time he is ready to use it. It is, after all, the client who 
must use the test results, and as their use is an emotionally loaded process 
it is well to let it be nondirective. 

Client-determined test administration is perhaps the best term for 
what Rogerians would call client-centered testing, for the type of routine 
testing which has already been described is also client-centered in that 
tests are selected on the basis of relevance to the client and results are 
used as he needs them to clarify his thinking. This procedure has been 
described by Bordin and Bixler (113) in a way which evokes considerable 
criticism from some counselors, as they proposed that the client choose 
his tests himself with little or no help from the counselor. It is not 
necessary, however, to go to this extreme in order for test selection and 
administration to be client-determined. If counseling has been begun 
nondirectively and the relationship is one in which the client works on 
problems of his own choice at his own speed, he may, to quote Rogers 
(640:142), “reach a point where, facing his situation squarely and 
realistically, he wishes to compare his aptitudes or abilities with those of 
others for a specific purpose. Having formulated some clear goals, he may 
wish to appraise his own abilities in music, or his aptitude for a medical 
course, or his general intellectual level.” When such desires are expressed 
the counselor may give the test or tests himself, giving the client the 
resulting information in the way already described. Better still in some 
cases, the counselor lets the client obtain the information himself by 
working up the percentiles and test profile together, thus continuing the 
mutual processes and joint activity of the counseling relationship, the 
client reacting to test data as they are obtained and the counselor 
explaining actuarial aspects and reflecting the client’s feelings. 

Combs (166:266) believes that the difficulties in the way of the 
counselor who attempts to provide information and clarify attitudes are so 
great as to make reliance on a third person as test administrator and 
interpreter desirable, with the client then returning to the original 
counselor for nondirective clarification of feelings and reaching of 
conclusions. As he points out, it takes a good deal of skill to shift from one 
relationship (“directive” supplier of facts) to the other (nondirective 
acceptor and reflector of feeling), but then so does any aspect of 
counseling, and this writer believes that it is a technique which can be learned 
and used like any other. Whether or not it is learned and used depends 
upon the counselor’s theoretical orientation, personal preferences, and 
work situation. In Combs’ case all three were in favor of not having one 
counselor shift back and forth from more to less directive techniques; 
in most cases the requirements of the work situation and the desirability 
of continuity of relationships and work lead the writer to believe that 
skill in the use of both techniques and in shifting from one to the other 
is desirable. 

The Counselor's Moral Responsibilities: Breaking Bad News and Sharing 
Good. As a user of psychological tests and as a diagnostician of 
vocational aptitudes and interests the counselor has available informa¬ 
tion which may be of crucial importance to the client and of value to 
society. But neither the individual nor society may be aware of the 
availability and significance of that information; the client may never 
ask for it, and the counselor may never seek to obtain or to share it, if 
strictly nondirective procedures are used. One might ask whether it is 
ethical for a counselor to let a high school student work through his 
attitudes toward going to college and to make college plans without 
checking up on his mental equipment for going to college. Does a 
counselor who knows that a young man who is planning to enter a skilled 
trade actually has abilities and interests which might make him a 
scientist of considerable stature owe it to his client to make him aware of 
that fact? And does he owe it to society? It is not only attitudes which 
make for success and satisfaction: abilities, opportunities, and awareness 
also play a part. The counselor has an obligation beyond that of assisting 
the client to assume responsibility for his own action, although that 
seems to be the sole objective of counseling set up by the nondirective 
school. He also has responsibilities for the detection and optimum use 
of talent, and for helping some clients to achieve insights into them¬ 
selves and into society which they might not develop in sufficient time 
if left to direct the entire course of counseling themselves. 

The counselor or test interpreter therefore needs to make sure that 
certain facts basic to the solution of the problems being worked on by 
the client are secured and considered. Intelligence tests may not be asked 
for in considering the choice of college or of occupational level: in such 
a case the counselor must either be sure from other evidence that the 
client has the ability to implement his plan or help the client see the 
need for the obtaining and considering of such evidence. Interest inven¬ 
tories may not be requested or interest scores discussed in considering 
the choice of a field of work; but if the counselor does not see good evi¬ 
dence in the cumulative record or case material that the field being 
considered is compatible with the client’s interests, he owes it to the 
client and to society to lead the client to want to obtain such evidence. 
Psychotherapy cures some seemingly physical illnesses and solves some 
seemingly vocational problems, but it leaves other body ailments and 
other vocational adjustment problems untouched: not to make a 
diagnosis when diagnosis may be important is as potentially serious an 
omission in a counselor as in a physician. The counselor or psychologist need 
be no more apologetic about being directive in such instances than is the 
physician. The crucial question has to do with how the counselor brings 
such evidence to the attention of the client. This is primarily a problem 
of counseling rather than of test interpretation, but as the solution is 
sometimes sought in test interpretation it should be briefly considered 
here. 

To put it in negative terms, a client should not be confronted with 
an unsuspected low intelligence test score, low musical aptitude scores, 
or an unfavorable personality inventory score. The counselor must 
instead lead the interview into channels which help the client to explore 
these characteristics. This may be done by getting him to talk about his 
school grades, his success as a member of a glee club or band, or his rela¬ 
tions with fellow-students or fellow-workers. Discussion of any of these 
matters in a permissive atmosphere usually leads the client to examine 
his aspirations and his disappointments, his strengths and weaknesses 
(509,627). Reflection of the related feelings encourages the pursuit of 
these topics. During the course of such discussions it is generally easy 
enough for the counselor to introduce relevant objective data with which 
the client should be familiar; in fact, the client will often ask for them 
before the counselor has to take the initiative. From then on the process 
is strictly one of counseling, and beyond the scope of this book. 



CHAPTER XXII 


PREPARING REPORTS 
OF TEST RESULTS 

WRITTEN reports of the results of psychological tests are generally 
prepared for one or more of the following reasons: 1) to provide a 
permanent record of the interpretations made by the person who counseled 
the client; 2) to provide an interpretation of the results for the use of 
other professional workers; 3) to insure that the user of the test results 
makes a thorough analysis of his data rather than relying on cliches or 
stereotypes; and, 4) to provide clients or their parents with a record of 
the interpretations for future reference. The first three reasons pertain 
to the same type of write-up, which may be referred to as the report to 
professional workers: the last may be called the report to clients. Each 
of these is taken up in turn in this chapter, from the point of view of 
purpose, form, and content. 

Reports to Professional Workers 

Depending upon the situation in which he is working, the psychologist 
who administers tests and reports on test results does so in one of three 
ways. He may simply submit a graphic profile of test scores to a 
counselor or personnel worker in the same organization, accompanied by 
notes on observations. He may make limited interpretations, working 
primarily from the test results, when testing for a colleague to use in 
working with the client, this user being a counselor, psychiatrist, social 
or personnel worker. Or he may draw on all the case material in making 
a full interpretation, avoiding dependence on the ability of the user to 
synthesize the test findings with case history material. 

Profiles of Test Results. The most effective way of presenting test 
results in some guidance centers and business organizations has been 
found to be the test profile or psychograph. This is true when testing 
is done by psychometrists who have no more skill in interpreting test 
results than the counselors or executives who use them, and when the 
latter have been so well trained in test interpretation that it is 
uneconomical for the psychologist to write out detailed interpretations which 
the counselor or personnel worker can make himself. In the latter 
situation the psychologist and the user of the test results generally find that 
a brief discussion is all the profile ever needs by way of supplementation, 
or that a few notes at the bottom of the sheet on the client’s behavior 
during testing take care of the subjective factors which the counselor 
should consider. 

The objective of the test profile is, then, to set forth the test results in 
the simplest and clearest form, so that a trained and experienced user 
of test results, who can confer with the examiner in case he has questions, 
can quickly grasp their significance. It is also sometimes a useful device 
for study with a client, serving as a basis for discussion in which the 
client develops insights by analyzing both the data and his reactions to 
them. He is aided by the counselor’s interpretations of the actuarial 
significance of the tests and reflections of feeling. 

The principles which govern the development of test profiles and the 
graphic representation of standing on tests can be outlined as follows: 
1) tests should be grouped according to type of aptitude or trait 
measured; 2) when standard batteries are used the test names should be 
printed on the profile sheet to the left of the grid, but when the tests 
used vary greatly with the client blank spaces should be left in which 
test names may be entered; 3) space should be provided after the name 
of each test for entering data concerning the norm group; 4) another 
space should be allowed for recording the test score or percentile; 5) the 
grid or graph on which the test results are plotted may be based on either 
percentiles or standard scores, or may show both, but the users should 
be conscious of the advantages and disadvantages of both types of scores; 
6) some test data are not appropriately represented on the grid together 
with aptitude and achievement tests and need a special type of 
presentation on the test profile; 7) space should be provided for supplementary 
personal data to aid in interpretation; 8) it may be desirable to record 
observations made during the test sessions to aid in understanding some 
of the objective scores. Each of these principles is taken up in more 
detail below. 
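
For readers who keep test records in machine-readable form, the eight 
principles just listed can be read as the specification of a simple record 
layout. The following is only a minimal sketch under stated assumptions, 
not a form taken from this book: the class names, field names, and the 
sample percentiles and letter rating are all hypothetical and serve purely 
as illustration. 

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProfileEntry:
    """One line of a test profile (all field names are hypothetical)."""
    trait_group: str                        # principle 1: type of aptitude or trait
    test_name: str                          # principle 2: printed or written-in name
    norm_group: str                         # principle 3: group used for comparison
    percentile: Optional[int] = None        # principles 4-5: numerical standing
    standard_score: Optional[float] = None  # principle 5: alternative scale
    letter_rating: Optional[str] = None     # principle 6: data not plotted on the grid


@dataclass
class ClientProfile:
    age: int
    sex: str
    entries: List[ProfileEntry] = field(default_factory=list)
    personal_data: Dict[str, str] = field(default_factory=dict)  # principle 7
    observations: List[str] = field(default_factory=list)        # principle 8

    def by_group(self) -> Dict[str, List[ProfileEntry]]:
        """Collect entries by trait group, keeping the order of entry."""
        groups: Dict[str, List[ProfileEntry]] = {}
        for entry in self.entries:
            groups.setdefault(entry.trait_group, []).append(entry)
        return groups


# Illustrative values only; the scores below are invented for the example.
profile = ClientProfile(age=23, sex="M")
profile.entries.append(ProfileEntry(
    "spatial relations", "Minnesota Spatial Relations Test",
    "general population", percentile=90))
profile.entries.append(ProfileEntry(
    "spatial relations", "Minnesota Paper Form Board",
    "general population", percentile=80))
profile.entries.append(ProfileEntry(
    "interests", "Strong Vocational Interest Blank",
    "men in the occupation", letter_rating="A"))
profile.observations.append("Worked steadily on the subtests proper.")

for group, entries in profile.by_group().items():
    print(group, [e.test_name for e in entries])
```

Grouping by trait places the two spatial tests side by side, which is the 
juxtaposition that the first principle is meant to secure. 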

1) The grouping of tests according to type of aptitude or trait meas¬ 
ured is primarily to facilitate the comparison of test scores which should 
be approximately the same. Although the Minnesota Spatial Relations 
Test and the Minnesota Paper Form Board measure the same basic 
aptitude, the latter test is more affected by general intelligence than the 
former; for this reason a client’s status will not be identical on the two 
tests, but a study of the differences often helps give a better 
understanding of the person tested. This type of analysis is aided by juxtaposition 
of the scores. In the case of interest inventory scores, which do not lend 
themselves well to plotting on the grid, parallel listing of the scores of 
the commonly used inventories is helpful in making comparisons. Three 
profile forms reproduced in this chapter (Figures 9, 10, 11, and 12) 
illustrate this principle. On the form used in the writer’s course in 
vocational testing (Figs. 9-10) the aptitude tests are grouped on one page 
by the side of the grid in the following sequence: scholastic aptitudes 
(3 tests), vocabulary (8 subtests), scientific (6 subtests standardized on 
technical groups), clerical (2 subtests), manual (6 tests or subtests 
beginning with gross-manual and graded to fine-finger dexterity), spatial 
relations (4 tests), and mechanical (7 subtests and 3 tests). The 
classification of tests is not strictly according to traits measured, but also 
takes into account the nature of the occupations for which the tests 
have been standardized, a compromise with the imperfections of test 
construction. In the Differential Aptitude Tests (Fig. 11) this difficulty 
is overcome, along with others noted below, by the practice of developing 
one co-ordinated battery of tests rather than using, as one so often must, 
a variety of tests from different sources. 

2) The names printed alongside the grid save time and improve the 
appearance and readability of the profile if the clients worked with are 
sufficiently homogeneous in status and objectives for a standard list of 
tests to be appropriate, selections being normally made from within 
this list. This is true in selection programs in which standard batteries 
are used, and in specialized guidance centers. The Vocational Advisory 
Service profile (Fig. 12) attempts to effect something of a compromise by 
listing traits measured rather than the test doing the measuring. This 
has the advantage of focusing on psychological characteristics, but makes 
it necessary to write in the name of the test except when users of the 
report know that a certain test is routinely used for each trait, as in the 
case of the spatial and dexterity tests. Figure 13 shows a form on which 
no test names are specified, because of the variety of tests used by the 
agency; it provides space in which to make note of the names. Such 
flexibility of forms is essential in such an organization. 

3) Spaces for the entering of data concerning the norm group are 
essential because most tests have several sets of norms from which the 
examiner selects the most appropriate. Without these notations the 
counselor cannot know the significance of the client’s standing, as in the case 
of the Minnesota Clerical Test, on which standings when compared to 
accountants are radically different from standings when compared to 
the general population or even to general clerical workers. These 
notations can be brief, as in Figure 9, for the counselor should know the tests 
well enough to remember the details if practice calls for no written 
interpretation by the examiner. 

[Figure: section of a profile form headed VOCATIONAL INTERESTS, with 
entries for Occupational Level, M-F, and Interest Maturity; columns for 
the A-V, Kuder, and Strong inventories; rows including I. Scientific, 
A. Biological, a. Physician, and b. Artist.] 

4) Space for recording the standard score or percentile obtained when 
the examinee is compared with the norm group is necessary, both as 
an aid to plotting the graph and as an aid to using it, for minor errors 
in plotting and reading graphs are common. The numerical entry 
permits accuracy, just as the graphic entry facilitates grasp of relation¬ 
ships. 

5) Grids based on percentiles have the advantage of using the familiar 
and readily understood form of expressing standing on a test in relation 
to other persons, whereas standard scores are much less commonly known 
in non-psychologically trained circles. But the percentile system has 
the disadvantage of distorting scores at the extremes, thereby minimizing 
the important differences in aptitude, while standard scores accurately 
express these same differences. For example, I. Q.’s of 135 and 190 are 
both expressed as the 99th percentile, despite the fact that the latter is 
much further from the mean than the former, making two persons with 
those scores seem equally intelligent instead of quite different from each 
other. As standard scores are based on distances from the mean, this less- 
known system reveals instead of hiding this difference. However, as it 
is probably easier to explain this fact to test users than to get them to 
adopt standard scores, it is probably wiser in practice to use the percentile 
system and keep its defects in mind. When space permits it is therefore 
wise to provide also for the recording of I. Q.’s (Fig. 9) and standard 
scores beside the grid. 
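
The distortion at the extremes can be made concrete with a short 
computation. What follows is a minimal sketch and not part of the original 
discussion: it assumes that I. Q.’s are normally distributed with a mean 
of 100 and a standard deviation of 15 (both figures are assumptions made 
for illustration), and that percentile ranks are reported on the customary 
scale of 1 to 99. The function names are hypothetical. 

```python
from statistics import NormalDist

# Assumed for illustration only: mean 100, standard deviation 15.
IQ_MEAN = 100.0
IQ_SD = 15.0


def standard_score(iq: float) -> float:
    """Distance from the mean in standard-deviation units (a z-score)."""
    return (iq - IQ_MEAN) / IQ_SD


def percentile_rank(iq: float) -> int:
    """Percentile rank under the normal model, clamped to the range 1-99."""
    proportion_below = NormalDist(IQ_MEAN, IQ_SD).cdf(iq)
    return min(99, max(1, int(proportion_below * 100)))


for iq in (135, 190):
    print(iq, percentile_rank(iq), round(standard_score(iq), 2))
```

Under these assumptions both I. Q.’s are reported as the 99th percentile, 
whereas the standard scores (roughly 2.3 and 6.0) keep the two persons 
clearly apart. 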

6) Test data which do not lend themselves to presentation on the 
grid include the results of some interest inventories and some 
personality measures. Scores on Strong’s Vocational Interest Blank, being 
measures of similarity of interests to those of men in various occupations, 
do not have the same meaning at the higher extremes as do aptitude 
tests. As Strong has pointed out (775:67), there may be no real difference 
in clerical interests expressed by standard scores of 55 and 65: both 
persons have interests like those of clerical workers, and the former may 
perhaps be as representative of clerical workers as the latter. Strong 
therefore rightly recommended use of letter ratings, and these do not 
lend themselves to plotting on the more refined continuum of percentiles. 
Figure 7 illustrates the profile form used by Strong, combining letter 
grades and standard scores. Another effective way of organizing such 
data is shown on the obverse side of another psychometric report form 
(Fig. 10), on which the types of interests measured by the Strong, Kuder 
and Allport-Vernon inventories are roughly equated and grouped accord¬ 
ing to types. It then becomes possible quickly to scan the entries in order 
to see in which occupational families high ratings tend to predominate. 
This has the advantage, also, of emphasizing the difference between 
aptitude and interests, frequently forgotten by clients and by relatively 
untrained counselors. 

7) Space for supplementary personal data is something of a safeguard 
against interpreting test results in a vacuum. Most forms call for age 
and sex at the top of the first page, where they are seen before any test 
scores. The obverse side of one form (Fig. 10) provides space for the most 
important educational, avocational, occupational, and aspirational facts 
concerning the client. These make it possible for the user of test results 
to check quickly the client’s measured interests against his expressed 
interests and ambitions, and to ascertain whether or not his aptitudes are 
reflected in experiences appropriate to them. The data are too sketchy 
for complete diagnostic work, but help in case of quick reviews. 

8) The recording of the observed behavior of the client often cannot 
and need not be a detailed and tedious task, and therefore generally 
does not require much space. It is important that some evaluative 
comments can be made on the test profile, however, in special cases. Figures 
12 and 13 reproduce forms in which space is provided for such notations. 
These are especially desirable in the case of apparatus tests which permit 
the subjective analysis of the client’s approach to problems. 

Limited Interpretation. This is the type of report of test results which 
should be prepared and used routinely in guidance centers in which 
psychometric work is done by psychologists, and in which counseling 
is carried on by vocational counselors who have more knowledge of 
occupational requirements and of counseling techniques but less of 
testing. It puts the burden of test interpretation where it should be, but 
leaves the integration of the interpreted test data with the case history 
material to the skilled counselor who sees the individual and his 
situation both objectively on paper and dynamically in interviews. It is not 
a worthwhile type of report in situations in which the counselor knows 
more about tests than the psychometrist, for then the counselor can see 
more meaning in the test profile than the psychometrist can put into 
the write-up. Neither is it valuable in situations in which the 
psychologist knows case study procedures well and the counselor, psychiatrist, 
social or personnel worker has not had extensive, specific experience in 
using test results. Nor is it likely to prove useful as a report from one 
agency to another, except when the other agency is an equally 
well-staffed counseling service. In such instances full interpretation on the 
basis of all available supporting evidence is essential, as borne out by 
the experience of more than one guidance center which has attempted 
to render testing services to social agencies: without full interpretation 
the test results generally impress the users as being of little or no practical 
use. 

The objective, then, of the limited interpretation of test results is to 
put into the hands of the counselor a concise, verbal, occupationally- 
rather than test-oriented statement of the significance of the test scores. 
The counselor then relates them to other data already in his possession 
or obtained as he works with the client. 

The principles which apply to the limited interpretation of test 
results may be stated as 1) the interpretation of each test score first in the 
light of the appropriate norm group or groups, 2) the relation of that 
score and percentile to observed behavior in the test situation, 3) the 
relation of each such interpreted test score to any others which may 
have bearing on its further interpretation, 4) the modification of this 
interpretation in the light of any personal data affecting the suitability 
of the test content or of the norms, 5) the expression of these 
interpretations in so far as possible first in psychological and then illustratively 
in broad occupational terms, and 6) the summarizing of the 
interpretations to yield a picture of a person and of his occupational potentialities. 

1) The interpretation of each test score first in the light of the 
appropriate norm group or groups requires only the verbal statement 
of what appears in the profile of test results. For example: “On the 1941 
edition of the American Council on Education Psychological 
Examination he stood at the 97th percentile when compared to freshmen in more 
than 300 colleges.” 

2) The relation of this interpretation to observed behavior in the 
test situation provides an opportunity to mention anything unusual 
which might have affected the client’s performance, such as resistance 
to taking the tests, undue tension, or concentration and a systematic 
approach to the task at hand; e.g., “He seemed impatient with the 
discipline inherent in the test situation and wanted to skip the 
instructions given before each practice test, but controlled these reactions and 
worked steadily on the subtests proper.” 

3) The relation of each test score to others which may have a bearing 
on its interpretation requires the mental review by the examiner of 
other data, and the mention in the report of any implications noted. 
These may consist of such things as the seemingly discrepant scores on 
two tests of the same aptitude or trait, and the congruence of or lack 
of agreement between two tests of different types of traits such as apti¬ 
tude and interest important in the same occupation. An example might 
be: “The evidence of aptitude for professional or executive endeavor 
provided by this intelligence test is not supported by the scores of the 
interest inventories administered.” 

4) The modification of interpretations such as those given above in 
the light of personal status affecting the suitability of test content or 
norms requires a reference to personal history data such as age, sex, 
education, and cultural background, and consideration of their 
resemblance to those of the standardization groups. To illustrate: “As the 
client is now 23 years old, it is probable that his standing when compared 
to freshmen on the A.C.E. Psychological Examination is somewhat biased 
in his favor, for it has been demonstrated that scores on this test increase 
with age during the age range from 18 to 22. Even if his standing on this 
test is really somewhat lower than it seems, however, the indications are 
that he is well above the average college freshman in scholastic aptitude. 
This is borne out by his score on the Wechsler-Bellevue, for which the 
comparison is with adults in general and shows him to be in the very su¬ 
perior category.” 

5) The expression of test scores first in psychological and then in 
educational or vocational terms ensures both the scientific accuracy of 
the description of the examinee and its meaningfulness to the non-psy¬ 
chologists who often use the results. It provides an educational or occu¬ 
pational sketch of the individual which is more dynamic than a profile 
of test results. The test summaries and case summaries which follow in 
this and in the next chapter will serve to illustrate this principle better, 
but a brief illustration follows. The interpretations which have so far 
been given might be followed up with “The conclusion may be drawn 
that this client has the scholastic aptitude successfully to complete the 
work of a four-year liberal arts college, although his interest inventory 
scores, which remain to be reviewed in detail, suggest that such work 
may not be exactly to his taste. Men with general ability comparable to 
his tend to gravitate toward professional or managerial work, whether 
or not they go to college.” 

6) The summary picture of the person tested, pointing up his 
educational and vocational potentialities and liabilities, brings together the 
gist of what has been brought out in connection with the results of 
specific tests. It attempts to integrate these findings into a dynamic pic¬ 
ture of psychological characteristics, from which occupational inferences 
may be drawn by those who know occupations, and of occupational 
possibilities indicated by the known validities of the tests used. The 
summary of the test report from which the preceding excerpts have been 
taken attempted to implement this principle in the following way: 

“In summary, the client appears to be a young man of very superior 
mental ability, capable of graduating from a good university and rising 
to positions of considerable responsibility. His speed in the perception 
of clerical detail, particularly numerical symbols, is comparable to that 
of successful accountants. His superior ability to judge shapes and sizes 
and mentally to manipulate them is indicative of promise in technical 
and artistic occupations. In ability to perceive and analyze the effects of 
physical forces and the operation of mechanical principles he does not 
compare well with engineers or skilled artisans, although he compares 
favorably with the general population. His interests are not highly 
developed in any area, although they resemble somewhat those of men 
who are engaged in business occupations involving contact with other 
persons and the management of enterprises. Experience has shown that 
many young men with abilities and interests such as this client’s enter 
business and find their way into executive positions.” 

Common errors in the preparation of reports of test results generally 
consist of violations of the above principles. Psychometrists tend to 
write in terms of test scores or percentiles rather than in terms of aptitudes 
and traits; psychologists without extensive contact with business 
and industry sometimes find it difficult to translate psychological 
characteristics into occupational behavior; and those who are not well 
grounded in both psychology and in occupations tend to overwork the 
brief and somewhat stereotyped interpretive phrases of the test manuals. 
The result is test-centered, and the real significance and value of testing 
is lost. Perhaps an illustration of poor reporting, accompanied by an 
improved version of the same report, will help to illustrate some of these 
points. To facilitate comparison they are reproduced in paired, original 
and revised, lines when changes seem desirable. 


In summary, the client’s very 
    { high scores on the Paper Form Board, the performance subtests of the 
      Wechsler-Bellevue Scale, and the Meier Art Judgment Test, 
    { superior ability to visualize space relations, to think in non-verbal 
      abstractions, and to judge the quality of form and composition, 
indicate a great deal of potential ability in art 
    { work or in work involving spatial as well as aesthetic judgment, 
    { and in related types of work, 
such as layout or production work in an advertising agency. The low score 
on the mechanical comprehension test suggests artistic rather than 
technical outlets for her 
    { spatial ability, 
    { ability to work in spatial arrangements. 
The average 
    { clerical scores and the a8th percentile “effectiveness of expression 
      and accuracy 
    { speed of perception of clerical symbols and average clarity 
of written expression” suggest that she 
    { will not be handicapped in these activities should they be involved 
      in her work. 
    { would do well not to specialize in business detail or linguistic work. 
Her intelligence, which is 
    { better than that of q; percent of the general population, should 
      ensure ability to succeed in any area using her other aptitudes. 
    { very superior, suggests ability to rise high in any work utilizing 
      her other aptitudes and providing outlets for her interests. 


Briefly stated, the changes in the above summary were intended to 
produce a description of the client’s abilities, interests, and occupational 
promise, rather than a summary of her test scores. Perhaps this type of 
report can best be made clear, after the outlining of principles which 
has just preceded, by reproducing in toto a report written for use by a 
trained counselor working in the same agency and for possible sending 
to a similar college counseling bureau. Such a report follows, with data 
slightly changed and identity disguised. 

REPORT OF TEST RESULTS: JOHN F. ATKINSON 
(Limited Interpretation) 

John Atkinson, a high school graduate, 19 years of age, was given two 
tests of scholastic aptitude. On the Otis Self-Administering Test of 
Mental Ability he was at the 50th percentile when compared to college 
students, which would suggest that his chances of competing with college 
students and completing college work in an average college are reasonably 
good. On the 19 J4 Edition of the American Council on Education 
Psychological Examination he was at the 27th percentile when compared 
to college freshmen. His linguistic score was at the 33rd percentile and 
his quantitative score at the 26th percentile. 

As the A.C.E. test is a somewhat longer and more appropriate instrument, 
this suggests that John, while able to compete with college students, 
is likely to find himself in the lower third of the student body and will 
therefore find it necessary to apply himself more effectively than the 
average college student in order to achieve satisfactory results. 

Several tests of special aptitudes which are important in engineering 
and scientific occupations were administered. On the Engineering and 
Physical Science Aptitude Test his mathematical score was at the 4th 
percentile, his physical science comprehension score at the 12th percentile, 
and his mechanical comprehension score at the 39th; on the other 
hand his arithmetic score was at the 50th percentile, his formulation 
score at the 70th percentile, and verbal comprehension score at the 74th, 
compared to men students in non-collegiate technical courses. These 
scores suggest that the client has more aptitude for work of a verbal 
nature than for mechanical or mathematical work. The relatively low 
standing in these latter areas is confirmed by the Minnesota Spatial 
Relations Test on which he scored at the 30th percentile when compared 
to engineering freshmen. On the O’Connor Wiggly Block his letter 
rating was C, which points up even more the lack of special aptitude 
in spatial visualization. 

One test of clerical aptitude was administered: The Psychological 
Corporation General Clerical Test. The total score on this test was at 
the 29th percentile when compared to clerical workers, the lowest part 
score being that for clerical speed and accuracy at the 19th percentile 
and the highest part being verbal facility at the 49th percentile. These 
results fit in with the data indicating greater verbal facility than 
numerical or spatial, but do not indicate special qualifications for clerical 
work. At the same time the scores are high enough to indicate reasonable 
chance of success in such employment if other things are favorable. 

Three measures of interest were obtained. On Strong’s Vocational 
Interest Blank John revealed a pattern of interests most similar to those 
of engineers, chemists, and other men engaged successfully in physical 
science occupations. His interests are also similar to those of teachers of 
high school science, and to those of production managers and others 
engaged in semi-technical industrial work; they also resemble those of 
musicians. According to Strong’s Blank his interests do not greatly 
resemble those of men in artistic work, biological science, social welfare, 
business contact, or literary and legal occupations. There is some sign 
of interest in business detail occupations, including office worker and 
purchasing agent. 

The results of the Kuder Preference Record did not agree very well 
with the results of the Strong Blank, although scores on the two tests 
tend to confirm each other in most cases. According to the Kuder, John's 
interests are strongest in artistic, musical, social welfare, and mechanical 
activities. The moderately high interest in music and in technical work 
indicates some agreement with Strong's Blank, but there is a real 
discrepancy between the two tests on artistic and social welfare interests. 

The Allport-Vernon Study of Values throws some light on this matter 
by showing a fairly high social welfare score and an average theoretical 
or scientific score, but confuses the issue by revealing a strong interest 
in material welfare, such as normally characterizes men in business 
contact occupations. What seems fairly clear from these interest test 
results is that John does have interests comparable to those of men who 
are successful in managerial work in industry; the picture is not clear- 
cut with reference to other interests. 

In summary, it would seem that John Atkinson is a young man of fair 
college aptitude, greater linguistic than scientific aptitude, and interests 
which most clearly resemble those of men engaged in managerial and 
supervisory positions in industry. His prospects for success in a four-year 
engineering or liberal arts college do not seem especially good, although 
such students do graduate from the less competitive colleges. On the 
other hand it seems likely that he would succeed in an industrial engi¬ 
neering course or in a business course aiming at administrative work, 
taken in an institution in which the competition is not too great. 

Full Interpretation. As previously stated, this type of report of test 
results is most likely to prove valuable when reports are prepared by 
well-trained and experienced vocational psychologists, particularly if 
they are to be used by counselors, psychiatrists, and social or personnel 
workers who have not been trained in the use of test results. In such 
instances the psychologist shares the other workers’ ability to make a 
case study, and has, in addition, the knowledge of tests, and of their 
occupational significance, not possessed by his colleagues. The effective 
use of talents then calls for full interpretation by the psychologist, 
although their application in counseling or in selection and promotion 
may be made by other specialists, depending upon the nature of the 
situation and of the case. The counselor may need the data in connection 
with educational and vocational planning, the psychiatrist in connection 
with therapy which calls for the most effective use of his patient’s 
abilities and interests, the social worker as an indication of the types of 
vocational rehabilitation which may be effective, and the personnel 
worker as an aid to making decisions concerning employment and 
advancement. And the psychologist who functions as a clinical counselor 
will also find that the preparation of such a report is one of the most 
effective methods of forcing himself fully to explore the significance of 
test results and personal history data. 

The objective of full interpretation is to tease the maximum of 
meaning from the test results by synthesizing them with other case-history 
material, at the same time using one type of data as a check on the 
other in preparing an accurate and vivid description of the person being 
studied. 

The principles guiding the preparation of full interpretations of test 
results are the same as the six governing limited interpretations, with 
the addition of one more which follows after the fourth in the list given 
earlier. This principle specifies the necessity of viewing the test 
data in the light of related case history material which it may confirm, 
contradict, or illuminate, or by which it may be confirmed, contradicted, 
or illuminated. 

Viewing test results in the light of other case material requires that 
the interpreter be trained and experienced in case-history taking and 
in the occupational and clinical significance of personal, socio-economic, 
educational, and avocational data. The process involves the examination 
of intelligence test results in the light of educational attainment; the 
comparison of measured interests with interests as manifested in school 
subjects, leisure time activities, and previous occupational experience; 
and the evaluation of special aptitude test scores in the light of 
accomplishments in related activities. To illustrate: “The Co-operative General 
Mathematics Test for High School Classes was also administered. The 
client had three and one-half years of high school mathematics, followed 
by college training in accounting, a master’s degree in business education, 
and three years as a junior accountant, in all of which he worked with 
figures. When compared with the four-year norm group he is at the 4th 
percentile, while with the three-year group he is at the 32nd. This low 
score is congruent with his low quantitative score on the scholastic aptitude 
test, his own statement that he feels weak in mathematics, and the 
fact that he failed the teaching examination in mathematics. The picture 
is clearly one of weakness in the mathematical area, although whether 
or not this weakness is the result of lack of aptitude, emotional 
maladjustment, or a combination of the two is not brought out by these data.” 

A complete report of test results in which full interpretation has been 
attempted is reproduced below, as the best way of conveying an idea 
of the principles and method. 

REPORT OF TEST RESULTS: JAMES 1. FRANK 
(Full Interpretation) 

James is an eighteen-year-old high school senior, who came for help 
in the choice of a career. Specifically, he wanted to know whether or not 
he should go into engineering. His father owns a manufacturing plant; 
he is interested in having the boy go to college and feels that he may 
be better qualified for administrative than technical work. His father 
thinks that industrial management would probably be a good field. The 
boy worked in his father’s plant during one summer but got a different 
job last summer, working as a bell hop in a resort. He preferred to go out 
on his own rather than into a job already made for him. He, too, feels 
that he is really more interested in administrative than in technical work. 

James was given the American Council on Education Psychological 
Examination, two different forms a week apart. On the first, he scored 
at the 74th percentile compared to entering college freshmen; and on 
the second, at the 76th percentile. On both tests, his linguistic score 
was distinctly higher than his quantitative, which suggests that his and 
his father’s hunch that James is not as strong in the technical field as 
in others has a foundation in fact. 

James also took the Engineering and Physical Science Aptitude Test. 
His scores, when compared to recent high school graduates applying 
for non-collegiate technical training, were in the bottom decile for 
arithmetic reasoning, in the fourth decile for mathematics and formulation, 
almost at the 75th percentile in physical science comprehension, and 
in the top decile for verbal comprehension and mechanical comprehension. 
These results suggest that, while James is weak in mathematical 
ability, he does have a rather high degree not only of verbal but also of 
mechanical aptitude. 

The Minnesota Spatial Relations Test was given, the letter grade 
being II — . As the norm group is the general population, it seems legiti¬ 
mate to conclude that James does have a relatively low degree of ability 
to visualize spatial relations. 

The Purdue Pegboard was administered, James being near the 99th 
percentile on all part scores and total scores when compared with college 
men. 

The Strong Vocational Interest Blank shows no primary interest 
patterns and no A ratings. The greatest concentrations of interests are 
in the physical sciences, technical and social welfare fields. James rates 
a B- as engineer and chemist, C as mathematician, B+ as mathematics-science 
teacher, B as production manager; B as personnel manager, B- 
as Y.M.C.A. physical director and social science teacher, C+ as Y.M.C.A. 
secretary, C as school superintendent. In the other fields, James’s scores 
are scattered B-’s and C’s. His interests are like those of many young 
men who go into business in that they are relatively undifferentiated. 
But he is somewhat stronger in the practical side of technical work and 
in the fields of human relations and personal contact. 

The Bernreuter Personality Inventory indicates that James is an 
emotionally stable, somewhat dependent, extroverted, rather dominant, 
self-confident and quite sociable young man. 

The rest of the results appear to agree quite well with the interview 
material. James’s extra-curricular and leisure-time activities are primarily 
social, and indicate not only interest but considerable skill in dealing 
with people. His grades in mathematics are poor but are acceptable in 
other subjects. His general ability is better than that of the average 
college freshman, but he lacks some of the special aptitudes required 
for success in a technical curriculum. He has certain other aptitudes 
which would be assets to him, particularly his mechanical comprehension 
and his verbal ability. James would probably do well in a position such 
as his father’s, in which facility in understanding mechanical processes 
is necessary, and in which finding some engineering activities 
congenial would help. His personality characteristics would also be an asset 
in the supervision and contact side of industrial management. 

In choosing a college, James would probably do well to select one in 
which he can follow a business administration or industrial engineering 
major. It would be helpful if summer employment could be obtained in 
an industry (other than his father’s) rather than in a field unrelated to 
his educational and vocational objectives. This would permit him to try 
himself out and get the experience of making his own way. It would 
make it emotionally easier for him to return to work in his father’s 
plant within a relatively brief time after the completion of his education 
if that seemed desirable. 

Reports to Clients 

The problem of preparing reports of test results for clients who have 
been counseled is a vexatious one. Ideally, the counseling of which testing 
and test interpretation are a part should have been so conducted that 
the client (and his parents, if they are involved) has integrated the test 
results into his own thinking. He then has insights into their significance 
which match his understanding of his school record and his vocational 
experiences and views them in very much the same light. Just as he does 
not think it necessary to have a written record of all of his jobs and of his 
performance on them, so he should not need to want a written report of 
the results of his tests. When he does, it generally means that they are 
thought of as a crutch of some sort. 

In a world in which there are crippled people crutches are sometimes 
desirable. They are to be frowned upon only when they contribute to 
keeping a person partially crippled longer than he need be. The fact that 
clients often want written reports indicates a need for a crutch in some 
cases. When such requests are at all frequent the counselor should examine 
his practices, in order to find out why his clients seem to feel the need 
of something tangible to lean on. 

In the writer’s experience, to which appeal is made only because no 
studies seem to have been made of this question, clients have wanted 
copies of test reports only 1) when testing has been overemphasized, 2) 
when the discussion of test results was not successfully integrated with 
counseling, and 3) when the client’s own insecurity led him to believe 
that he could use a report of test results to sell himself to a potential 
employer more successfully than he could on the basis of his experience, 
education, and conduct in the employment interview. Before discussing 
the form and content of such written reports to clients as are prepared, it 
may therefore be wise to deal briefly with methods of handling these 
problems. 

Methods of Handling Client Requests for Test Reports. Methods of 
avoiding overemphasis on testing and of successfully integrating test 
results with counseling have been discussed in some detail in Chapter 21, 
and need not be gone into here. But when a client requests a written 
report of test results which have already been discussed such preventive 
techniques are no longer usable. In the experience of the writer and the 
counselors whose work he has supervised it has generally been effective 
to ask the client how he expects to use the report. When emotionally 
insecure clients reply that they will show it to prospective employers as 
evidence of their qualifications for a job, the counselor asks the client to 
put himself in the place of the employer. He is to consider how he would 
react to an applicant who applied for a position and produced a sheet of 
test results to prove his qualifications. This generally brings about a 
realization of the artificiality of the technique and of the fact that most 
employers still judge in terms of other types of evidence. If this realization 
does not come at once it can be helped by asking how often employers 
have asked the client if he had any test results to show his qualifications, 
or had requested that he obtain such. The client then generally 
recognizes that if employers were inclined to depend upon or to be much 
impressed by the results of tests given by organizations other than their own 
they would ask for them more often. 

The client may state that he would like to have the test report for 
incidental use when talking with employers, that they are tending to become 
test conscious and might be impressed by the fact that the client had taken 
the trouble to study himself so thoroughly before applying for a job. The 
counselor can use this as an opening for discussing job-hunting techniques, 
introducing the client to books such as the Edlunds’ (236). This helps the 
client to see that he can best demonstrate the care with which he has gone 
about seeking employment by an intelligent understanding of the company 
for which he wants to work and of the ways in which he can serve 
it, and that merely having a report of tests (of someone else’s insights 
rather than of his own) is likely to be of little value. The counselor’s 
suggestion that any employer interested in obtaining a report of the 
client’s test results might write for such a report with the client’s 
permission generally appeals at this point as a more effective way of putting 
the test results to work than taking a report with him. 

In certain other cases it may be desirable to send copies of test reports 
to parents, school principals, or potential employers who may not be well 
qualified to interpret test results and whose discretion in their use cannot 
be taken for granted. It should be recognized that the sending of reports 
to parents is at best a compromise with an undesirable situation, 
and that if reports from the counselor to parents are necessary they should 
really be made as a part of counseling. School and industrial users of test 
results are in a different category, but even in their case it would be 
preferable from both client’s and school’s or industry’s point of view if the 
psychologist were to make his report to the principal or employer on 
the basis not only of familiarity with the client but also with the school 
or work situation, helping the recipient of the report to integrate 
client-data with situational-data. Because written reports are often the best 
possible compromise in such situations, methods of interpreting the 
results in writing are discussed in the next paragraphs. 

The objectives of the report to parents or other laymen are the increasing 
of their understanding of the abilities, interests and personality of the 
client. As the recipients of these reports are not thoroughly trained in 
psychology, testing, or counseling the report aims to describe these not in 
psychological terms but in terms of their educational and vocational 
implications. 

The principles governing the writing of reports to laymen may therefore 
be formulated as follows: 1) test names, test scores, and most psychological 
trait and aptitude terminology should be avoided in favor of 
descriptions of probable educational and vocational performance; 2) 
these descriptions should be phrased in broad terms and illustrated with 
typical concrete examples; and, 3) a brief summary giving a dynamic 
picture of the individual should bring together the interpretations of the 
more specific aspects of probable behavior. Each of these principles is 
taken up briefly below. 

1) The substitution of descriptions of probable behavior in educational 
and vocational situations for psychological terminology requires the making 
of statements concerning probable success in college, technical institutes, 
and appropriate types of occupations rather than the description of 
mental status; it involves comparisons of the interests of the client with 
those of men in occupations which he is or perhaps should be considering, 
rather than in terms of letter ratings or percentiles. These varied actuarial 
comparisons are both more meaningful and less traumatic than descriptions 
of ability levels or personality traits would be. It was written of one 
high school sophomore with an intelligence quotient of 90 and no A’s 
or B’s on Strong’s Vocational Interest Blank: “His chances of doing good 
work in a college preparatory course are slight, but it is probable that he 
could complete the graduation requirements of the general, commercial, 
or trade curriculum. His general ability is equal to that of many men who 
have succeeded in skilled trades, such as machinist, printer, or plumber, 
but he might have difficulty competing in the more demanding aspects 
of technical work such as mathematics and blue print reading; he could 
probably compete successfully with routine clerical workers such as stock 
clerks and general clerks, but would not be likely to rise to a position of 
responsibility in office work; as a machine operator or assembly worker 
in a factory he would be competing with men of his own ability level, and 
could, other things being equal, rise to a position of leadership as a foreman 
or supervisor. His interests do not resemble those of men engaged 
in engineering, business, or skilled occupations, suggesting that he may 
find most satisfaction in work which does not require a great deal of 
specialized information; instead, he may find his satisfactions more in 
his contacts with other people or in outside activities. Many men with 
interests and abilities like his are more interested in a job with regular 
hours, steady pay, opportunities to make friends with other people, and 
time off in which to indulge in special interests such as sports, association 
with men friends, and reading newspapers and magazines, than in the 
exact nature of the work they do.” 

2) The couching of such statements in terms sufficiently broad to avoid 
the appearance of prescription but concrete enough to be meaningful, 
and the use of specific examples which are illustrative rather than limiting, 
is perhaps the most difficult part of writing reports of this type. They 
require considerable knowledge of occupations and of the world of work 
on the part of the person writing the report. Considerable help can be 
obtained from the literature, e.g., through the use of occupational norms 
such as those published for intelligence tests after both World Wars 
(see Ch. 6) and for various types of tests by projects such as the Minnesota 
Employment Stabilization Research Institute (589), and through familiarity 
with studies of workers such as those resulting from the Western 
Electric experiments (637) and the Yankee City studies (909). The 
illustration in the preceding paragraph should serve for this principle also. 

3) The summary statement at the close of the report serves to pull 
together the gist of what has been said before and to avoid the overemphasis 
of isolated statements which might happen to impress the reader. 
Of the boy partly described earlier in this discussion it might be said, for 
example, “In summary, John is a boy who should be able to complete a 
high school education in the general, commercial, or trade curriculum if 
he so desires. His abilities and interests suggest that he is most likely to 
find success and satisfaction in the middle range of occupations, and, in 
that group, most probably in the less competitive general office or factory 
jobs. It seems probable that he will derive more satisfaction from employment 
which permits him to have interesting friendships and recreational 
outlets than from some one type of work requiring special preparation 
over a long period of time. John’s aim might well be the ability 
to shift readily from one type of factory operation to another, skill in 
getting along with people, and knowledge of a variety of industrial 
processes which make a valued employee in his own company and a very 
employable applicant in the eyes of other concerns.” 



CHAPTER XXIII 
ILLUSTRATIVE CASES: 
DATA AND COUNSELING 


THERE are so many different types of tests, so many tests of each type, 
and so many studies of the validity of some of these tests, that it is 
difficult in books on testing to find adequate space for discussion of the 
ultimate purpose of testing: achieving insight (by the client) and an 
understanding of a person (by the counselor). An effort has been made 
to deal systematically with this topic in Chapter 20, and to treat problems 
of reporting test results to clients in Chapter 21; it still remains, 
however, to describe the diagnosis and counseling of a number of individuals, 
and to report their subsequent vocational adjustments, in order 
to show how the test data were used and how well the deductions from 
them foreshadowed subsequent developments. 

Opportunity should also be provided for the student of testing to put 
to work the insights which he has developed from the contents of this 
book and from his own experience, by presenting the test data and essential 
case material in such a manner as to permit him to make his own 
appraisals before reading those made by the counselors who actually 
handled the cases. The reader may also want to attempt to predict the 
subsequent educational and occupational histories of the boys and girls, 
men and women, described by the case summaries. It should prove 
instructive to see how well the reader’s and the counselors’ insights 
corresponded with what actually took place. From such comparisons one gains 
new insights into the meanings of test scores, the interplay between various 
types of personal characteristics, and the interaction of personal 
characteristics and social environment. 

The seven cases described in this and the following chapter were 
tested and counseled by the writer, his associates, and his students in a 
number of different places at various times during the past 15 years. 
These places were the Cleveland Guidance Service of the National Youth 
Administration in Ohio, the Guidance Service operated by Clark University 
in co-operation with a number of high schools in central Massachusetts, 
the Psychological Services Branch of the AAF Regional and 
Convalescent Hospital in Miami Beach and Coral Gables, Florida, and 
the Guidance Laboratory of Teachers College, Columbia University. For 
ethical reasons, even the place of work with any individual case and the 
identity of the counselor (sometimes the writer but sometimes an associate 
or student), as well as all more personal identifying data such as 
names and institutions, are disguised. 

The method of presentation requires a word of explanation in order
that the reader may obtain the maximum desired benefit from the
material. The case histories are divided into three sections: 1) case
summaries and test profiles, 2) counselors’ interpretations and the immediate
outcomes of counseling, and 3) follow-up reports. Within each of these
three sections the cases are presented in the same order, beginning with
boys and girls first counseled as high school students and closing with
experienced men and women who came for counseling because they
were considering changing occupations. Thus Tom Stiles’ background
data and test profile are presented first, followed by those for Marjorie
Miller, Ralph Sheridan, etc. Then the sequence begins again, this time
for giving the counselors’ interpretations and the plans, if any, made by
the client. (It may be of interest that this material was written up for
publication before the follow-up data were obtained, to avoid
contamination by hindsight.) The sequence then begins over again for the last
time, to show the current status of each of the cases in turn and to
consider the validity of the counselors’ appraisals. It is suggested that
readers interested in obtaining the maximum possible value from this
chapter make their own diagnosis (and prognosis, if so inclined) after
reading the background material and studying the test profile of each
case, add anything they wish to this after reading the account of the
counselors’ work, and then compare their notes with the follow-up data
in the next chapter as these are read for the first time.

Background Data and Test Profiles 
The Case of Thomas Stiles: When is an Engineer an Engineer?

Tom was 17 years old, in good health, of average height and weight, a
high school senior when he came to the counselor. He was enrolled in the
academic course, in which he liked the work in mathematics and science
better than anything else, and cared least for English and history. His
leisure-time activities consisted largely of spectator sports; he liked also
to read popular scientific and adventure story magazines. As a younger
boy he had done odd jobs at home, and since then had had part-time and
summer jobs working as a helper on a truck, operating machines in a
shoe factory, helping in a garage, and working in a machine shop. Some
of these jobs had been for no pay; others, the more recent, had been
paid work.

The student’s father was an operative in a shoe factory; the mother
kept house, and several siblings, all younger than Tom, were still in
school.

Tom stated that he was interested in machines, having lived among
various types of machinery all his life; his junior high school ambition
had been to be a diesel engineer or marine engineer, an ambition which
had broadened to include work with almost any type of engine: steam,
diesel, or airplane, especially the last-named type, as “it is the coming
field.” He thought he would like engineering training, but was not
certain of his choice. Asked what he would like to be ten years hence he
replied: “Foreman or superintendent in an airplane factory.”

Figure 15.

GRADES: THOMAS STILES

Marks, in order of year taken: 9th Grade, 10th Grade, 11th Grade,
12th Grade (1st sem.)

Subjects                Marks

English                 C    D    E    III-C    IV-C
Latin I                 C
Civics                  B
World History           B
Prob. of Democracy      C
U.S. History            C
Algebra I               B
Plane Geometry          B
Review Math.            C
Math. (Gen.)            B
Solid Geometry          C
Physics                 D
General Chem.           C
Phys. Education         D

The cumulative record in the school office showed that Tom’s high
school work was mediocre. As shown in the accompanying chart, he had
failed junior English, did poorly in physics, had made only C’s in
mathematics after the 10th grade, and was doing no better in chemistry. His
I. Q. on the Henmon-Nelson Test of Mental Ability, administered at the
beginning of his junior year and recorded on the school record, was 106.
The profile of test results obtained by the counselor during the first
semester of Tom’s senior year in high school is shown in Figure 16.

Tom’s questions were: “Should I go into engineering? I am interested 
in engines. Should I continue my education in order to prepare for such 
work? What about engineering college?” 


Figure 16.

TEST PROFILE: THOMAS STILES

                                                        Norms           %ile

Scholastic Aptitude   A.C.E. Psych. Exam.               Coll.            14
                      Otis S.A. I. Q. 101               Students         19
Reading               Nelson Denny: Vocab.              Fresh.            1
                                    Paragraph           "                30
Achievement           Coop. Social Studies              "                11
                      Coop. Mathematics                 "                74
                      Coop. Natural Sciences            "                 5
Clerical Aptitude     Minn. Clerical: Numbers           Gen'l. Clerks     6
                                      Names             "                 6
Mechanical Ability    O'Rourke Mechanical Aptitude      Men in Gen'l.    57
Spatial Relations     Minn. Paper Form Board, Rev.      Coll. Fresh.     69
Personality           Calif. Pers.: Self                "                35
                                    Social              "                40
                                    Total               "                40

VOCATIONAL INTERESTS

1. Biological Sciences    B+        4. Social Sciences     C
2. Physical Sciences      A         5. Business Detail     B-
3. Technical                        6. Business Contact    B
     Carpenter            A         7. Literary            B
     Policeman            A
     Farmer               B+


Exercise 1.

a) Prepare a written analysis of the test results of this case as though for
transmittal to his grade adviser. Use the sample on page 582 as a model. Do
this before reading further, and save your report to compare with the appraisal
actually made by the counselor.

b) Outline the plans which you think are most suitable for this client,
including your approach to counseling the client, in the light of your psychometric
report. Save these for comparison with the counselor’s conclusions and with the
results of counseling.

Marjorie Miller: A Case of College and Choice of a Scientific Field

Marjorie was 16 years old when counseling began. She was then an
academic high school senior, in excellent health, of average height and
weight, very good looking, friendly, and mature in manner. She reported
that she liked chemistry, languages, and history best, and had no special
dislikes in school. Her leisure-time activities consisted of photography,
dramatic club, work on the school paper, scouts (Mariner, in charge of
younger troop), participant sports, dancing, and painting; in this last
connection, she had entered some of her work in local exhibits. Her
reading consisted largely of school-required books. Her part-time and
summer work experience consisted of selling Christmas cards and
working in a gift shop.

This pupil’s father was employed as an executive by an insurance
company; the mother was a housewife; there were no brothers or sisters.



she wanted to “specialize in some definite subject so as to be ready to
work” after graduation. She was considering two nearby colleges of good
but not outstanding reputation, neither of which was actually a liberal
arts college but both of which had good professional and business
curricula. Her occupational preferences were chemical research (“I think
I would like the work”), dietetics (“I like the subject”) or the teaching of
chemistry in high school or nursing school (“If I had to teach, I would
want it to be chemistry”), but she was undecided as to her actual choice.
She had previously thought of art, surgery, medical laboratory work, and
tea room management, in that order, beginning in the last years of
grade school. Ten years hence she wished to be “connected in some way
with science or medicine.”

The high school record showed that Marjorie had done uniformly
superior work. Her grades were close to or above 90 in all subjects
except for 85 in Review Mathematics and Senior Algebra and 80 in
Typewriting. The only patterning revealed is perhaps slightly less
strength in mathematics than in the more verbal subjects. She said “I
would rather spend time on chemistry than on any other subject. I am
interested in math but find it rather hard. I have never taken social
studies [this despite a current course in history] but I’m sure I’d like
them.” The principal described Marjorie as “a brilliant girl with unusual
ambition, with many interests, particularly in science.” Marjorie’s test
profile, obtained during the first semester of 12th grade, is reproduced
in Figure 18.

Figure 18.

TEST PROFILE: MARJORIE MILLER

                                                      Norms                   %ile

Scholastic Aptitude   A.C.E. Psych. Exam.             Coll. Fresh.             72
                      Otis S.A. I. Q. 124             Students                 87
Reading               Nelson Denny: Vocab.            Fresh.                   80
                                    Paragraph         "                        96
Achievement           Coop. Social Studies            "                        59
                      Coop. Mathematics               "                        90
                      Coop. Natural Sciences          "                        70
Aptitudes             Minn. Clerical: Numbers         Clerical Workers, F.     57
                                      Names           "                        34
                      Minn. Paper Form Bd. Rev.       Coll. Fresh.             70
Personality           Calif.: Self-Adjustment         "                        80
                              Social-Adjustment       "                        40

VOCATIONAL INTERESTS

                                  Strong      Allport-Vernon %ile

Biological Sciences                                 90
     Physician                    B+
     Dentist                      B
     Nurse                        A
     Artist                       B-                 7
     Math-Sci. Teacher            B-
Social Sciences                                     35
     Y. W. Sec'y.                 B
     Social Wkr.                  B+
     Social Sci. Teacher          C
     Lawyer                       C
     Librarian                    C
Domestic
     Housewife                    A
Business Contact
     Life Insurance Saleswoman    C                 67
Literary
     Author-Journalist            C+
     English Teacher              C
Business Detail
     Office Clerk                 C+
     Stenographer                 C+
Political (Prestige)                                47
Religious                                           92

Femininity: 72nd %ile

The statement of the problem as seen by Marjorie was to choose
between dietetics and chemical research, to decide what kind of training
to get and where, and to find out more about the kinds of jobs that
might be available to her after completing college.

Exercise 2.

a) Prepare a written analysis of the test results of this case, as though for
transmittal to her grade adviser. Use the sample on page 582 as a model. Do
this before reading further, and save your report to compare with the appraisal
actually made by the counselor.

b) Outline the plans which you think are most suitable for this client,
including your approach to counseling the client, in the light of your psychometric
report. Save these for comparison with the counselor’s conclusions and with the
results of counseling.

Ralph Sheridan: A Case of College and Finances

Ralph was a 17-year-old high school senior when tested, a boy of
something above average height, heavily built, in good health, pleasant to look
at and to talk with. He was enrolled in the college preparatory course,
and gave English and mathematics as his favorite subjects, and foreign
languages, especially Latin, as his least liked. His leisure-time activities
consisted of hunting, fishing, trapping, and other solitary or small-group
outdoor activities; he was active in debating at school; his reading
consisted largely of adventure and historical fiction. During spare time and
summers he had worked as a berry picker (when younger), in a general
store, and on a farm.

Figure 19.

GRADES: RALPH SHERIDAN

Marks, in order of year taken: 9th Grade, 10th Grade, 11th Grade,
12th Grade to date

Subjects                     Marks

History                      90
English                      -     92    91    91
French                       86    86
Latin                        -     80    83
Geometry; Trigonometry       88    88
Rev. Algebra; Chemistry      85    88
Physics; Bookkeeping         91    87

The Sheridan family consisted of Ralph’s father, who owned and
operated a combination general store and service station; his mother,
housewife; and younger brothers and sisters.

Ralph expected to continue his education after graduation, had
definitely decided to be a civil engineer but did not know which college to
go to or how to finance it. His second and third preferences consisted
of lumberjack and farmer, these latter because he “liked the work”;
engineering was chosen because “you can make money and the work is
rather pleasant.” Ten years hence he wanted to be “alive and rich,” the
first part of which ambition seemed understandable during the Battle
of Britain.

Figure 20.

TEST PROFILE: RALPH SHERIDAN

                                                        Norms           %ile

Scholastic Aptitude   A.C.E. Psych. Exam.               Coll. Fresh.     <14
                      Otis S.A. I. Q. 125               Students         89
Reading               Nelson Denny: Vocab.              Fresh.           99
                                    Paragraph                            99
Achievement           Coop. Social Studies                               95
                      Coop. Mathematics                                  82
                      Coop. Natural Sciences                             77
Clerical Aptitude     Minn. Clerical: Numbers           Gen'l. Clerks    26
                                      Names             "                71
Mechanical Ability    O'Rourke Mechanical Aptitude      Men in Gen'l.    86
Spatial Relations     Minn. Paper Form Board, Rev.      Coll. Fresh.     50
Personality           Calif.: Self-Adjustment                            30
                              Social "                                   20

VOCATIONAL INTERESTS

Biological Sciences    C+        Social Sciences     C+
Physical Sciences      B         Business Detail     B
Technical                        Business Contact    A
     Carpenter         C+        Literary            B
     Policeman         B-
     Farmer            B+

The cumulative record showed that Ralph’s high school work had been
of college caliber, as all but his Latin grades had been 85 or above, and
even they had been above 80. Pattern analysis showed that his grades in
verbal subjects were slightly, but perhaps not significantly, higher than
in quantitative subjects.

The problem, as expressed by Ralph, was to choose an engineering
college and to find a way of financing it.

Exercise 3.

a) Prepare a written analysis of the test results of this case, as though for
transmittal to his grade adviser. Use the sample on page 582 as a model. Do
this before reading further, and save your report to compare with the appraisal
actually made by the counselor.

b) Outline the plans which you think are most suitable for this client,
including your approach to counseling the client, in the light of your psychometric
report. Save these for comparison with the counselor’s conclusions and with the
results of counseling.

Paul Manuelli: A Problem of Choosing a College Major

Paul was a 17-year-old high school senior taking college preparatory
work and enjoying mathematics and science most while caring least for
“history, etc.” He was a very tall, well-built individual with excellent
health, a pleasant appearance, and agreeable manner. His leisure-time
activities consisted largely of participant sports (in which he excelled),
parties, and dancing. Part-time and vacation work occupied a good deal
of his time, and consisted of cooking, soda jerking, and mowing lawns;
while in junior high school he had worked as a caddy, and had raised
vegetables and sold them.


Figure 21.

GRADES: PAUL MANUELLI

Marks, in order of year taken: 9th Grade, 10th Grade, 11th Grade,
12th Grade

Subjects                                      Marks

Drawing                                       90
Ancient History; U.S. History & Gov't         90    90
English                                       90    90    90    90
Public Expression; Latin                      90    90    95    95
French                                        85    95
Algebra; Geometry                             90    90    90    (Solid) 85
College Chemistry; College Physics;
  Physical Education                          85    00    85


The Manuelli family included the father, employed as an operative
in a local factory; the mother, a housewife; an older brother who worked
in a factory like the father; an older sister then in training as a nurse;
and two younger sisters.

Paul was not sure about continuing his education beyond high school
but hoped to be able to go to engineering school. He had saved his
summer earnings, but needed more money to help finance his education.
He was considering engineering and law as occupations, the former
because he liked the related school subjects, the latter because he enjoyed
debating and public speaking. He was thinking of West Point as a means
of combining his engineering interest with the possibility of war, which
was then going on in Europe.


Figure 22.

TEST PROFILE: PAUL MANUELLI

                                                        Norms           %ile

Scholastic Aptitude   A.C.E. Psych. Exam.               Coll. Fresh.     87
                      Otis S.A. I. Q. 123               Students         84
Reading               Nelson Denny: Vocab.              Fresh.           97
                                    Paragraph           "                98
Achievement           Coop. Social Studies                               73
                      Coop. Mathematics                                  94
                      Coop. Natural Sciences            "                24
Clerical Aptitude     Minn. Clerical: Numbers           Gen'l Clerks     26
                                      Names             "                53
Mechanical Ability    O'Rourke Mechanical Aptitude      Men in Gen'l     53
Spatial Relations     Minn. Paper Form Board, Rev.      Coll. Fresh.      3
Personality           Calif.: Self-Adjustment                            95
                              Social "                  "                90
                              Total                     "                95

VOCATIONAL INTERESTS

Biological Sciences    B         Social Sciences     B-
Physical Sciences      A         Business Detail     C+
Technical                        Business Contact    B+
     Carpenter         C+        Literary            B
     Policeman         B+
     Farmer            B


Paul’s school record showed a high level of achievement, his grades all
being 85 or better and the bulk of them 90. His achievement in verbal
subjects was slightly higher than grades in quantitative subjects, but the
difference was not great enough to be conclusive. He had been given the
Otis Quick-Scoring Test of Mental Ability, Gamma Form, at the
beginning of the 11th grade, and had been given an I. Q. of 113; it was noted
on the cumulative record, however, that he had ranked 9th in a class of
185 pupils, suggesting that there might have been something wrong with
the testing. The raw score was 52; a recheck of the I. Q. shows that this
is the equivalent of an I. Q. of 113. Other notations showed that he was
very well thought of by the school staff, both as a student and as an
athlete.

The problem as Paul saw it was to find a way to finance a higher
education, and to decide in which field to major. Although attracted
to both engineering and law, he really had no definite idea as to what
he wanted to do. As engineering specialization begins early, he felt the
need to choose or reject it during the last year in high school.

Exercise 4. 

a) Prepare a written analysis of the test results of this case as though for
transmittal to his grade adviser. Use the sample on page 582 as a model. Do this
before reading further, and save your report to compare with the appraisal
actually made by the counselor.

b) Outline the plans which you think are most suitable for this client,
including your approach to counseling the client, in the light of your psychometric
report. Save these for comparison with the counselor’s conclusions and with the
results of counseling.

James G. Revere: A Case of Dissatisfaction and Desire to Change
Occupations

Mr. Revere was a 29-year-old credit clerk, single, a graduate of an
academic high school in the small city in which he was working at the
time he first came for counseling. He was of average height, stocky, and
getting slightly bald around the temples in a way that made him look
older than his age. He was personally attractive, open in his manner
and fluent of speech with an interesting touch of humor and cynicism.
He dressed conservatively and well.

The client’s first full-time job (after high school graduation) was a
short-lived position as proof boy in a publishing house, after which he
left home to take a brief course in diesel engineering. For the next six
months he was unemployed, then he took a temporary position with his
present employers. He had been with them since, except for a period of
military service. He took some correspondence work in diesel engines
while in uniform. The work as credit clerk was satisfactory insofar as pay
and stability were concerned, but held no particular challenge; it seemed
like a blind alley.

The problem, as Mr. Revere saw it in the first interview, was to
“discover what I am best suited to do,” so that he might plan a suitable
program of night school study and prepare for a more promising
occupation. He was not certain just what he wanted from this future
occupation, but suggested three things which seemed important to him: a
substantial income, satisfying work, and opportunity to make use of his
mechanical and clerical training.

Figure 23.

TEST PROFILE: JAMES G. REVERE

Test                                     Norms                                  Percentile

Wechsler-Bellevue Adult Intelligence
Test
  Total I. Q. 123                        General Population (adults)              95th
  Performance I. Q. 110
  Verbal I. Q. 131
Otis Quick Scoring Test of Mental
Ability (Gamma A)
  I. Q. 127                              H.S.-College Population                  99th
Michigan Vocabulary Profile Test
  Human Relations                        Grade 12 students                        98th
                                         (At Mean for Bus. Admin. Sr's)
  Commerce                               College Seniors                          84th
                                         (+1 above Mean for Bus. Ad. S.)
  Government                             Grade 12 students                        98th
                                         (At Mean for Bus. Admin. Sr's)
  Physical Science                       Grade 12 students                        96th
                                         (+1 above Mean for Bus. Ad. S.)
  Biological Science                     Grade 12 students                        50th
                                         (-4 below Mean for Bus. Ad. S.)
  Mathematics                            Grade 12 students                        84th
                                         (-3 below Mean for Bus. Ad. S.)
  Arts                                   Grade 12 students                        84th
                                         (At Mean for Bus. Admin. Sr's)
  Sports                                 Grade 12 students                        98th
                                         (+3 above Mean for Bus. Ad. S.)
Minnesota Test for Clerical Workers
  Numbers                                Male Clerks                              16th
  Names                                  Male Clerks                              40th
Pennsylvania Bi-Manual Work Sample
  Assembly                               Male Industrial Workers                  23rd
  Disassembly                            Male Industrial Workers                  31st
Minnesota Paper Form Board               H.S. Graduates who were applicants       38th
(Rev. Ed.)                               for a technical course
                                         Jr.-Sr. Vocational School students       50th
O'Rourke Mechanical Aptitude Test        Applicants for Mechanical Jobs           93rd
Bennett Mechanical Comprehension         Candidates for Technical Courses         87th
Test

Figure 23 (Continued)

Strong Vocational Interest Blank for Men            Kuder Preference Record
                                  Letter Score                           Percentile

I. Scientific                                       Scientific             39th
   A. Biological
      1. Physician                C+
      2. Dentist                  C+
      3. Psychologist             C
      4. Artist                   B-                Artistic               91st
   B. Physical
      1. Architect                B
      2. Engineer                 B+
      3. Mathematician            C+
      4. Chemist                  B
II. Technical-Mechanical                            Mechanical             75th
      1. Production Manager       B+
      2. Math-Science Teacher     C+
III. Social Welfare                                 Social Welfare         3rd
      1. Y Secretary              C
      2. Personnel Manager        C+
      3. City School Supt.        C
      4. Social Science Teacher   C
      5. Minister-Rel.            C
      6. YMCA Physical Dir.       C
IV. Business Detail
   A. Clerical                                      Clerical               81st
      1. Office Worker            B-
   B. Computational                                 Computational
      1. Accountant               B-
      2. Purchasing Agent         A
V. Business Contact                                 Persuasive             82nd
      1. Sales Manager            B
      2. Life Ins. Salesman       C
      3. Real Estate Salesman     B-
VI. Literary-Legal                                  Literary               70th
      1. Author-Journalist        B-
      2. Advertising Manager      B
      3. Lawyer                   C+
VII. Miscellaneous
      1. C.P. Accountant          B-
      2. Musician                 B                 Musical                48th

Exercise 5.

a) Prepare a written analysis of the test results of this case as though for
transmittal to his adviser. Use the sample on page 582 as a model. Do this before
reading further, and save your report to compare with the appraisal actually
made by the counselor.

b) Outline the plans which you think are most suitable for this client,
including your approach to counseling the client, in the light of your psychometric
report. Save these for comparison with the counselor’s conclusions and with the
results of counseling.

Ruth Ann Desmond: A Case of Dissatisfaction and Ill-Defined Objectives

Miss Desmond was a 23-year-old high school teacher of business
subjects, a tall, very slender, shy young woman with a warm smile which
appeared frequently during interviews. She had graduated from the state
university and subsequently taught for two years; it was during her third
year of teaching that she came to the counselor. In college Miss Desmond
had been most interested in mathematics, and had chosen accounting as
a practical application of her interest. She had worked during her
summer vacations, including those after she graduated, as an electrical-unit
assembly girl, salesgirl, stenographer, and finally junior accountant. Her
brief experience in this last field had been of a routine nature, and led
her to conclude that she did not want that kind of work. Investigation
of the work of her associates in the accounting office did not improve the
picture, even with promotions in mind. Her present job as commercial
teacher appealed to her no more than the others: the students did not
seem really interested in business, and this made teaching an unrewarding
activity.

The client’s other activities and interests included miscellaneous social
activities, reading historical novels, and photography. Her older brother
had a darkroom which made it easy for her to pursue an interest in
developing and printing as well as in taking pictures.

Figure 24.

TEST PROFILE: RUTH ANN DESMOND

Test                                   Norms                  Percentile

A.C.E. Psych. Exam.                    College Fresh.           79th
  Quantitative                         "                        77
  Linguistic                           "                        81
Co-operative Gen'l. Culture
  Current Social Problems              "                        94
  Mathematics                          "                        91
  Science                              "                        87
  Social Studies                       "                        85
  Literature                           "                        84
  Fine Arts                            "                        65
Minnesota Clerical
  Names                                Clerical Workers         25
  Numbers                              "                        21
Purdue Pegboard (One Trial)
  Right hand                           Factory Applicants       87
  Left hand                            "                        81
  Two hands                            "                        40
  Assembly                             "                        85
MacQuarrie Mechanical Ability
  Total                                                         70
  Dotting                                                       99
  Tapping                                                       75
  Tracing                                                       55
  Copying                                                       64
  Pursuit                                                       53
  Blocks                                                        60
Bennett Mechanical Comprehension W.W.  Waves                    70
Minnesota Spatial Relations            Civilian Adults          C-

Figure 24 (Continued)

Interest

Strong's Blank               Grade    Kuder            %ile    Allport-Vernon   %ile

Author                       C        Literary         83
Librarian                    C
Artist                       B        Artistic         60      Aesthetic        30
Physician                    C        Scientific       57      Theoretical      30
Dentist                      C        Mechanical       47
Life Insurance Saleswoman    C        Persuasive       88      Economic         50
Social Worker                B+       Social Service   53      Social           60
English Teacher              B+
Social Science Teacher       B+
Lawyer                       A
YWCA Sec'y                   C                                 Religious        75
Math-Sci. Teacher            C        Computational    75
Nurse                        C
Stenographer                 B+                                Political        75
Office Worker                B+       Clerical         03
Housewife                    C
                                      Musical          49

Bernreuter Personality Inventory          Minnesota Personality Scale

Emotional Stability    75th %ile          Morale               53rd %ile
Self-Sufficiency       55                 Social Adjustment    35
Social Dominance       65                 Family               20
                                          Emotionality         55
                                          Liberalism           80

Miss Desmond was aware of no clear-cut vocational interests. She had
always expected to go into teaching because that was what her mother
talked about for her. She said she really did not know what she wanted
to do, but, when asked whether her real interest might be in marriage
rather than a career, it seemed clear that she did, for some time at least,
want to make her own career.

The problem, as Miss Desmond put it, was “to find out for what type
of work I am best fitted.” Dissatisfied with the only two applications to
which she thought she could put her college interest in mathematics,
unaware of any special interests and ambitions at the moment, she
wanted help in developing a better understanding of her abilities and
their vocational uses.

Exercise 6. 

a) Prepare a written analysis of the test results of this case as though for
transmittal to her adviser. Use the sample on page 582 as a model. Do this before
reading further, and save your report to compare with the appraisal actually
made by the counselor.

b) Outline the plans which you think are most suitable for this client,
including your approach to counseling the client, in the light of your psychometric
report. Save these for comparison with the counselor’s conclusions and with the
results of counseling.

James L. Johnson: A Moderately Successful Man in Search of New
Worlds

James Johnson was a 36-year-old married man, tall, well built, and
athletic in appearance, with a dignity beyond his years. He had been
employed on government projects as a civilian during the war, having
been physically disqualified from military service. With the closing down
of war plants he was soon to be released from this work, and wanted to
start making definite plans for the transition back to peacetime
employment.

Mr. Johnson had graduated from an outstanding technical institute
with a degree in business administration, at about the time when college
graduates were finding that the depression had made radical changes in
the employment situation as they had understood it when they chose
their major fields. He had originally planned to become an engineer, but
had swerved from this objective because of the pessimism of an older
friend. His first job after graduation was in a factory, where he was
employed with the understanding that he would be trained for an executive
position. He was soon made foreman in charge of a department, but after
some time the training program was dropped because of the depression
and the prospects of advancement grew slight. Although he enjoyed the
production work, the hours were long and the temperature unhealthy in
his department, so he left after a year to take a job with a retail clothing
company. This also was for executive training, but as he could not accept
this company’s questionable policies he resigned after several months.
His next position was with a distributor, a family-owned concern in
which he was given the responsibility for setting up a new department
the operation of which, once it was established, was such a routine
matter, with so few outside contacts, that it bored him and left him tired
at the end of each day despite its easiness. There being little prospect of
promotion to jobs normally held by the family and its connections, Mr.
Johnson left to become placement director of a small but well-established
college. This involved a slight increase in pay, and he enjoyed the
variety of contacts, the pleasant working conditions, and the educated
people he generally dealt with. When the war came he took a leave of
absence in order to accept employment on a government project. Here
too he had executive responsibility, varied duties, and better pay.

Mr. Johnson’s vocational aspirations, as he saw them, were for work as
varied and with as congenial a clientele as those he had known as
college staff member and government official, with a staff to handle detail
so that he could concentrate on policy, development, and other broader
matters, and pay equal to or better than his wartime salary. He could
have returned to his college position, but discussion of this matter with
the college president had made it clear that there would be no possibility
of increasing the staff of the placement office and little in the way of
salary increases for him despite the institution’s eagerness to have him
return. The client therefore felt that he should systematically canvass
other possibilities, to re-establish himself in the best possible way in
returning to normal employment.

Figure 25.

TEST PROFILE: JAMES L. JOHNSON

California Test of Mental Maturity, Adv. Battery
  Total I. Q.                      124
  Language I. Q.                   127
  Non-Language                     118
Wechsler-Bellevue Vocabulary Test
  Full Scale I. Q. equivalent      120
Minnesota Spatial Relations Test      Engineering Freshmen    94th %ile
Bennett Mechanical Comprehension AA   "  Job Applicants       15th "

Interests                    Strong Grade    Kuder %ile

Biological Science
     Physician               C
     Dentist                 C
     Psychologist            C
Physical Science                             10
     Engineer                C
     Mathematician           C
     Chemist                 C
Technical                                    <■>6
     Math-Sci Teacher        \\
     Production Mgr.         A
Artistic                                     54
     Artist                  C
     Architect               C
Social Service                               72
     Minister                B-
     Social Science Teacher  C
     City School Supt.       B-
     Y Physical Director     B
     Y Secretary             B+
     Personnel Manager       B+
Literary-Legal                               30
     Lawyer                  C
     Advertiser              C
     Author-Journalist       C
Business Contact                             95
     Sales Manager           B
     Life Insurance Salesman B-
     Real Estate Salesman    B-
Business Detail
     Accountant              B+              7
     Purchasing Agent        A
     Office Worker           A               7
Miscellaneous
     Musician                C               76
     CPA                     C

The problem with which this client wanted help was the appraisal of 
abilities and interests, and the analysis and evaluation of ways in which 
they might best be put to use to help him find work of an executive type, 
with congenial (educated) associates and contacts, and at a good salary 
(defined as $5000 or more per year). He realized that it might be difficult 
to find unless he made good use of contacts, but he thought that a 
position as an administrative assistant might give him needed experience in 
some one line or industry and put him in a position to advance to 
executive responsibility. He was considering, in addition, selling tangibles 
such as cars or oil, especially if he could get an agency. He was not 
interested in insurance and other intangibles. 

Exercise 7. 

a) Prepare a written analysis of the test results of this case as though for 
transmittal to his adviser. Use the sample on page 582 as a model. Do this before 
reading further, and save your report to compare with the appraisal actually 
made by the counselor. 

b) Outline the plans which you think are most suitable for this client, 
including your approach to counseling the client, in the light of your psychometric 
report. Save these for comparison with the counselor’s conclusions and with the 
results of counseling. 

Counselors’ Appraisals and the Immediate Results of Counseling 

In this section the interpretations of test and other data made by the 
counselors who worked with these persons will be presented, followed 
in each case with a statement of the immediate outcomes of counseling, 
that is, of the plans decided upon by the client or of the apparent status 
of his thinking when he left the counselor. Readers who wish to derive 
maximum value from this chapter should, before reading this section, 
have made notes on their own diagnoses and prognoses as arrived at 
while reading the preceding section. In some cases, in which the amount 
of specific detail in the case record permits and the techniques of 
counseling are interesting, material is included to illustrate points made in 
Chapter 20. 

Thomas Stiles: Diagnosis and Counseling (case material on p. 590 ff.) 

The Counselor's Appraisal. Tom’s intellectual level, as shown by his 
Otis I. Q. of 101 and confirmed by an A.C.E. score which put him at the 
14th percentile point of a typical college freshman class, was about 
average when compared to that of the general population. Occupational 
intelligence norms from both World Wars indicate that this is the ability 
level typical of skilled tradesmen and of the most routine clerical 
workers, an observation confirmed by various studies made with the Otis 
test in industry. His mastery of school skills and subjects as shown by his 
scores on the achievement tests was about that to be expected from one of 
his mental ability level, and decidedly below that of the college freshmen 
with whom he was compared, except for a superior score on the achievement 
test in mathematics, his favorite subject. This suggested that he 
might have abilities useful in technical occupations at the skilled level 
which seemed appropriate to his mental ability. His school marks, 
however, were not so encouraging, being only B’s in mathematics prior to 
his junior year, and C’s since then. The explanation may have lain in 
his being in the more abstract college preparatory course. 

On the special aptitude tests Tom appeared to lack speed in 
recognizing numerical and verbal symbols such as is required of even routine 
clerical workers. Combined with his marginal intellectual ability for 
office work, this strengthened the basis for questioning the choice of a 
clerical occupation. On the other hand, Tom’s scores on the tests of 
spatial visualization and mechanical aptitude seemed to confirm the 
implications of the mathematics achievement test. His inventoried 
interests, too, were in the physical science and subprofessional technical 
fields; the latter field seemed more in keeping with his intellectual level 
and with his poor achievement scores and fair grades in the natural 
sciences. 

Tom’s family background, leisure activities, and expressed vocational 
ambitions were all congruent with the implications of the test results. 
His father was a semiskilled worker, indicating that work at the skilled 
level might well be accepted by the family as a step upward. There were 
no older siblings who might have established a higher record for him to 
compete with. His leisure activities were nonintellectual, but they did 
show interest and achievement in mechanical and manual activities, as 
well as familiarity with work at those levels. He stated that he wanted to 
work with engines. It was true that, under the influence of a college 
preparatory course in the academic high school of a substantial 
middle-class community, he raised the question of going to college to study 
engineering, but in most contexts his discussions of work with engines 
were pitched at the skilled level. 

The counselor who worked with Tom therefore felt that Tom would 
be wise to aim at a skilled trade, either by means of a technical school of 
less than college level, through apprenticeship, or through obtaining 
employment as a helper in an automotive maintenance shop and taking 
night school courses. 

The Counseling of Tom Stiles. The counselor began by asking the 
pupil to bring him up to date on his thinking about his postschool plans. 
Tom did this, indicating no real change in his ideas and mentioning 
college only incidentally to rule it out as an impractical objective. The 
counselor then reviewed the evidence of the tests and school grades, 
discussing the intelligence test data in terms of general population 
percentiles and college freshmen, but focusing mostly on their occupational 
equivalents. Family socio-economic status and the low intellectual level 
of the boy’s leisure interests were of course not mentioned: the 
acceptability of skilled work to Tom and his family was considered something 
for him to mention, if at all, and as an attitude for the counselor to accept, 
reflect, and clarify. Tom did not mention it, seeming to consider it quite 
acceptable. His leisure activities were mentioned by the counselor as 
fitting in with the aptitude and interest data; this interpretation was 
accepted by the client with the statement that “Yes, I always have thought 
I was best at mechanical things, and I like them best, too.” 

Ways of utilizing Tom’s skilled technical potentialities were taken up 
next. No decisions were reached in this interview, but two nearby 
technical schools were discussed, and the counselor made sure that Tom had 
access to their catalogues, one of which was examined in order to review 
admission requirements, courses, and expenses, and to be sure that the 
student was oriented to such matters. The apprentice training program 
of a nearby factory, in which aircraft engines were being produced and 
increasing numbers of young men were being trained, was considered. 
Tom knew about it, and discussion helped him to plan how and when to 
apply if he decided on it; he saw the possible advantages of having such 
a specialty if he were drafted. Less formal ways of getting experience with 
automotive engines were looked into, and night schools offering 
appropriate courses were mentioned. Tom was not sure what he would do, 
when he left the office, but he felt that he knew a number of suitable 
alternatives and that he could choose between them in good time. 

Exercise 8. 

a) Compare your interpretation of the test results with that of the counselor, 
and note the ways in which they differ. Study these differences in order to locate 
possible inadequacies in your conception of the significance of the tests or scores 
in question, or ways in which your insights may be more adequate than those 
of the counselor. 

b) Compare your tentative plans with those considered suitable by the 
counselor. Compare your proposed approach with that used by the counselor. What 
shortcomings are suggested, in your work or in that of the counselor? Evaluate 
these in the light of the client’s reactions and the immediate outcomes of the 
counseling as it was done. 

Marjorie Miller: Diagnosis and Counseling (case material on p. 592 ff.) 

The Counselor's Appraisal. Marjorie’s scholastic aptitude tests 
indicated that she would probably stand in the top quarter of a typical 
college freshman class, although they did not justify the principal’s 
characterization of her as “brilliant.” Her vocabulary and reading scores 
suggested that this characterization might be based in part upon unusual 
ability to put her aptitudes to work, for her reading speed was decidedly 
superior to her scholastic aptitude and even to her vocabulary level. 
Marjorie was not outstanding on the social studies achievement test, in 
which subject she had had little preparation, and only moderately 
superior in the natural sciences which appealed to her, but this latter may 
have been due to not having included physics in her program. Her 
mathematics achievement was very superior. In general, these data were 
in keeping with the school grades, which we have seen to have been 
superior; but the trends were reversed, for her mathematics grades were 
slightly inferior to those in verbal subjects. 

Marjorie’s perceptual speed, when working with clerical symbols, was 
in the average range for clerical workers, her standing on the numbers 
test being high average and on the names test low average. Her score on 
the test of ability to visualize spatial relations was only moderately high 
for college freshmen, and would therefore not be outstanding when 
compared to scientific workers. It seemed high enough, however, to warrant 
no special consideration if other things were favorable. 

The personality inventory scores revealed nothing of significance. Her 
interests, as measured by Strong’s Blank and the Allport-Vernon Study 
of Values, seemed to be concentrated in the scientific field, with some 
signs of interest in the social and religious fields. They were quite 
feminine, and although she did not have much in common with office workers 
who tend as a group to make high housewife scores, her interests did 
resemble those of housewives. It seemed worth noting that her highest 
scientific interest score was as nurse, which hardly belongs in that group 
and which is heavily saturated with the interest factor which is most 
important in housewives. She had stated that she thought she might 
eventually marry, but this thought seemed to play no part in her 
vocational planning. 

Marjorie’s school and leisure-time interests did not do much to weight 
the balance in the direction of either scientific or social interests. Her 
favorite school subjects included chemistry, languages, and history. Her 
activities encompassed not only photography, but also the school paper, 
scout leadership, drama, and painting. 

Marjorie’s expressed ambitions were in the direction of natural 
sciences, in keeping with her measured interests, tested achievement, and 
some aspects of her school and recreational record. The counselor was 
inclined to give more weight to these factors than to the secondary 
interest pattern in social welfare work, the social welfare and literary 
activities, and the achievement in verbal subjects in school. He concluded 
that Marjorie would be wise to go to a liberal arts college where she 
would still have opportunity to explore both the social welfare and the 
scientific fields in courses and in activities. He thought that it would be 
well for her to select a college which had strong offerings in the natural 
sciences so that if she did choose this field she would be able to prepare 
for it as well as her abilities and drive warranted. It was the counselor’s 
opinion that Marjorie would probably become a medical laboratory 
technician, dietician, or high school teacher of science. 

The Counseling of Marjorie Miller. Like that of Tom Stiles, the 
counseling of Marjorie Miller was done in a situation in which one or at 
most two interviews were customary, case material being worked up 
ahead of time and discussed in a factual manner with the student. 
During and especially after the review of the data by the counselor, in terms 
of their actuarial significance for educational and vocational choices, the 
pupil had opportunity to react to them and to discuss them. The 
counselor then attempted to help the client understand his reactions, see the 
implications of the data, and consider possible lines of action. He drew 
on whatever informational resources were needed and available in order 
to help with the pupil’s orientation. In Marjorie’s case the review of the 
data seemed to bring out little that she did not already realize, although 
the objective and actuarial form in which they were presented impressed 
her, as might a view of oneself in a mirror for the first time in one’s life. 
The possibility of keeping her program broad for the first two years of 
college and still finishing with a vocationally usable major had not been 
known to her; when this was mentioned, she suggested that she might 
then do well to continue to explore both the scientific and the social 
fields before making a decision. 

Marjorie then raised the question of which college, as those she had 
been thinking of, rather vaguely, did not offer genuine liberal arts 
programs but instead specialized from the freshman year. The counselor 
mentioned several colleges of the type which he thought might be 
appropriate to her, and asked Marjorie if she had ever thought of any of them. 
Finances appeared to be a problem. The counselor had made note of 
some scholarships for which students might possibly apply, one of them 
being a very desirable scholarship offered by a first-rate college and 
limited to girls from her part of the state. Marjorie wondered whether 
she could qualify for such a prize. The counselor, knowing the standing 
of some girls who had previously been awarded it, said he thought she 
might and encouraged her to apply. She decided to do so, although she 
could not afford to go to that college without generous financial aid. 
There was some discussion, also, of ways in which campus activities, 
courses, and summer vacations could be utilized by Marjorie to get a 
better idea of the direction in which she wanted to turn when she came 
to the fork in the road. 

Exercise 9. 

a) Compare your interpretation of the test results with that of the counselor, 
and note the ways in which they differ. Study these differences in order to locate 
possible inadequacies in your conception of the significance of the tests or scores 
in question, or ways in which your insights may be more adequate than those 
of the counselor. 

b) Compare your tentative plans with those considered suitable by the 
counselor. Compare your proposed approach with that used by the counselor. What 
shortcomings are suggested, in your work or in that of the counselor? Evaluate 
these in the light of the client’s reactions and the immediate outcomes of the 
counseling as it was done. 

Ralph Sheridan: Diagnosis and Counseling (case material on p. 595 ff.) 

The Counselor's Appraisal. The psychometric data indicated that 
Ralph was indeed college caliber, having more scholastic aptitude than 
about 90 out of 100 college freshmen. His vocabulary and reading ability 
were on a par with his promise, perhaps superior to it. His mastery of 
school subjects as measured by the achievement tests was also superior, 
his greatest strength being in the social studies, with mathematics and 
natural sciences significantly lower but also better than those of most 
college freshmen. This agreed with his high school grades as to level 
of accomplishment in general, but revealed differences in mastery which 
were greater than the slight trends shown in his grades. 

Ralph’s ability to perceive numerical symbols quickly and accurately 
was low when compared to that of clerical workers, but his perception of 
verbal symbols was superior. In ability to judge shapes and sizes Ralph 
equaled the typical college freshman but showed no special promise in 
comparison with freshman engineers. In understanding of the uses of 
the tools and materials of mechanical and related work, he was superior 
to the typical skilled worker. This is not surprising in one who led an 
outdoor life; his preference for woodcraft activities rather than 
mechanical may be related to the lack of a high degree of spatial comprehension 
and might lead one not to emphasize the mechanical aptitude score. 

The somewhat low adjustment scores fit in with the solitary and 
small-group leisure-time pattern, but are not low enough to give cause for 
concern. 

Ralph’s interests, as assessed by Strong’s group scales and a few 
supplementary individual keys, most resembled those of men in business contact 
occupations such as life insurance sales. They resembled somewhat those 
of men in engineering occupations, clerical and accounting work, and 
the literary-legal fields, but less so. They were rather like those of farmers, 
and may be presumed to have been rather like those of forest service 
men. These interest scores give some support to his expressed desire to 
study engineering, but, combined with his lack of technical hobbies, 
superior mastery of the social studies, unusual verbal ability, and 
preference for English, historical fiction, and adventure stories, give even more 
reason for questioning the choice of engineering. The counselor believed 
that Ralph might graduate from engineering school, despite the fact that 
many like him drop out or change fields; he doubted very much whether 
Ralph would use engineering training in earning a living. His solitary 
interests, however, made business contact work seem unlikely to prove 
satisfying. If Ralph was set on engineering, it seemed wise to suggest that 
he consider industrial engineering, production management, and similar 
activities rather than the more technical aspects of the work. Business 
administration might be a better major. Forestry seemed like another 
possibility which might give him a combination of the things which 
interested him. 

The Counseling of Ralph Sheridan. The data were reviewed with 
Ralph as they had been with Tom and Marjorie, on an 
information-sharing basis. He saw the reasons for questioning his choice of 
engineering, but felt that he still wanted engineering training. He believed that 
as a civil engineer he might be concerned mostly with the management of 
production or construction work, and that this would be in line with his 
semitechnical interests. He expressed an interest in exploring 
nontechnical uses of engineering training as he progressed in his training. The 
counselor felt that he had a good grasp of the situation, and that better 
insight might develop later. The rest of the interview was devoted to 
places at which Ralph might obtain the desired training and ways of 
financing it, problems with which we are not here concerned. 

Exercise 10. 

a) Compare your interpretation of the test results with that of the counselor, 
and note the ways in which they differ. Study these differences in order to locate 
possible inadequacies in your conception of the significance of the tests or scores 
in question, or ways in which your insights may be more adequate than those 
of the counselor. 

b) Compare your tentative plans with those considered suitable by the 
counselor. Compare your proposed approach with that used by the counselor. What 
shortcomings are suggested, in your work or in that of the counselor? Evaluate 
these in the light of the client’s reactions and the immediate outcomes of the 
counseling as it was done. 

Paul Manuelli: Diagnosis and Counseling (case material on p. 597 ff.) 

The Counselor's Appraisal. Paul’s scholastic aptitude tests confirmed 
the opinion that the earlier I. Q. did not truly represent his mental level. 
Compared to college freshmen he seemed very superior, ranking in the 
top 15 percent. His vocabulary and reading speed were even higher. 
The achievement tests showed that he was unusually well prepared in 
mathematics, superior in social studies, but not well prepared in the 
natural sciences. This seemed surprising, as he had received an 85 in 
chemistry the preceding year, and an 85 in the first semester of physics 
at the time of testing; as his grades in the linguistic subjects were 
generally slightly superior to those in the quantitative, this may actually 
reflect true differences in his special abilities. 

In ability to distinguish numerical symbols with speed and accuracy 
Paul ranked low when compared to clerical workers, but his facility with 
verbal symbols was average for such workers. His understanding of the 
nature and uses of mechanical, woodworking, and related tools and 
processes was average when compared with that of skilled workers, but his 
ability to visualize and mentally manipulate objects in space was quite 
inferior when compared with that of college freshmen. Despite the high 
achievement in mathematics, this poor showing in spatial visualization 
and relatively low standing in natural sciences, combined with high 
verbal ability and superior social studies achievement, appeared to lend 
some support to this student’s second expressed interest: law. His very 
superior measured adjustment agreed with the opinions of the school 
staff. 

Paul’s inventoried interests were most like those of physical scientists, 
including engineers, and also resembled those of men in business contact 
work such as life insurance sales. He showed some interest also in the 
biological science and literary-legal occupations, but not as much as 
might have been expected of a boy who enjoyed public speaking and 
debating, had little in the way of technical hobbies, and was considering 
law as seriously as engineering. 

The counselor was inclined to give more weight to the factors pointing 
in the direction of engineering than to those contradicting it. 
Mathematical ability and interest supported the choice, while poor achievement 
in science and poor spatial visualization opposed it. It seemed possible 
that the spatial relations score was for some reason not representative, 
and the poor science preparation might have been more a matter of 
teachers than of pupil. It hardly seemed justifiable to question seriously 
the choice if Paul made it after a review of the data. 

The Counseling of Paul Manuelli. In view of the above, the counselor 
let Paul talk some about his vocational objectives. These seemed more 
than ever to involve engineering training, but Paul wanted to know how 
he compared with engineering freshmen. His profile was therefore 
reviewed. Paul reacted particularly to the relatively high mathematics 
standing, to the low Paper Form Board score, and to the greater degree 
of interest in physical science than in legal occupations. He was not 
inclined to be discouraged by the low spatial score, and might perhaps 
have given it no more thought, but the counselor suggested that it might 
be taken as a warning signal, and that if mechanical drawing or other 
such activities ever gave him trouble he might want to look into it 
further; it was mentioned also that his standing might be checked by a 
retest. College choices were then discussed, the client raising the question 
after indicating that he thought he would go ahead with engineering. 
Paul felt that his financial status might make choice of a co-operative 
training program wiser than a four-year engineering school. The nearby 
engineering schools operating on the co-operative system were therefore 
considered with the aid of catalogues, and one was most thoroughly 
discussed as being accessible, inexpensive, and of good standing. 

Exercise 11. 

a) Compare your interpretation of the test results with that of the counselor, 
and note the ways in which they differ. Study these differences in order to locate 
possible inadequacies in your conception of the significance of the tests or scores 
in question, or ways in which your insights may be more adequate than those 
of the counselor. 

b) Compare your tentative plans with those considered suitable by the 
counselor. Compare your proposed approach with that used by the counselor. What 
shortcomings are suggested, in your work or in that of the counselor? Evaluate 
these in the light of the client’s reactions and the immediate outcomes of the 
counseling as it was done. 

James G. Revere: Diagnosis and Counseling (case material on p. 599 ff.) 

The Counselor's Appraisal. Mr. Revere’s intellectual ability, as shown 
by an individual and a group test of mental ability, was decidedly 
superior, both tests placing him above the 95th percentile. A test of 
specialized vocabulary revealed that he had unusual verbal ability in a 
variety of fields. Although none of the norms for this test were strictly 
appropriate to this client, who was much older and more experienced 
than the high school seniors and less trained in business than the college 
business seniors to whom he was compared, the indications were that he 
was well informed in all fields except the biological sciences. 

His speed in handling clerical symbols was poor for numbers and 
average for names, when compared with clerical workers. His manual 
dexterity was fair, as was his ability to judge shapes and sizes and 
mentally to manipulate them. His knowledge of mechanical tools and 
processes, and his ability to comprehend mechanical principles and 
operations were very superior when compared with those of persons with some 
experience and interest in those fields. 

The client’s interests, according to Strong’s Blank, were most similar 
to those of men engaged in scientific, subprofessional technical, and 
business detail occupations. It was notable that none of these patterns 
were clear cut, each involving some B- or C+ scores, and only one, the 
business detail group, including an A (purchasing agent). At the same 
time, there were scattered B’s in several other fields, including business 
contact and literary-legal occupations. The Kuder pointed these findings 
up by yielding a low scientific interest score, with a high mechanical; 
even higher scores were made, however, in computational, clerical, and 
persuasive interests. The high artistic and substantial literary scores on 
the Kuder were discounted as lay interests, as they were not appreciably 
reflected in the Strong or in the client’s leisure activities. 

The test results generally appeared to fall into no more clear-cut a 
pattern than did the interest inventories, but some study of them 
suggested a few conclusions of significance. The combination of interest in 
mechanical or subprofessional technical work with interest in persuasive 
activities and mechanical aptitude and fair spatial visualization suggested 
that sales and service in the mechanical field might provide a suitable 
outlet for the client. Contraindicating this, however, seemed to be the 
client’s intellectual level and his desire for status. 

These same low-level or attenuated technical interests and abilities 
being also characteristic of production managers and industrial 
engineers, who are characterized by a level of mental ability more nearly 
resembling that of the client, it seemed to this counselor that work of 
this type might provide Mr. Revere with activities of a more satisfying 
type accompanied by status more in keeping with his desires. 

While the testing was going on, however, the counselor had a number 
of interviews, extending over a period of three months, with the client. 
These interviews were focused at times on types of work and ways of 
getting into them, but at other times on the client’s feelings of insecurity. 
These were brought out only incidentally in the second or third contact, 
although the counselor had suspected their existence in the first 
interview; as counseling progressed they came to the surface more often, and 
were recognized as a fundamental part of the client’s adjustment 
problem. 

The counselor’s diagnostic formulation of the case was as follows: Mr. 
Revere was faced with a very real problem which is not uncommon 
among clients in the late twenties and early thirties. It was that of an 
intelligent, maturing man who had not developed or achieved any 
clear-cut occupational goal and was becoming concerned about it. He 
evidently realized that he needed to take steps of some sort in order to 
derive greater satisfaction from his work. Having held only one real job, 
and feeling that he knew little about occupational opportunities, he 
sought the help of the counselor. His feelings of insecurity (unrecognized 
at first), together with his overconfidence in the prescriptive power of 
tests, caused him to expect the counselor and tests to solve his problems 
for him. He was therefore reluctant to make his own decisions and take 
the responsibility for his own actions. 

The Counseling of fames Re acre. The counselor believed that there 
weie two impoitant objectives in his work with Mr. Revere. One was to 
help him to clarify his objectives, \alues, and interests by discussing these 
at length in a permissive and insight-producing situation. The other was 
to help him accept, understand, and o\crcomc his feelings of insecurity, 
by letting him talk about them and discover ways of handling them. As 
the client had come to the counselor with vocational aspects of his adjust¬ 
ment as his leason, and as the objective data indicated that he had some 
reason for feeling misplaced, it was felt that the best way to get at values, 
life goals and feelings of insecurity, was through a discussion of the 
client's vocational problems. From the beginning the client felt strongly 
that tests would help him, despite an attempt to play them down during 
the first two interviews; it was therefore decided that testing might be a
help in keeping some kind of rapport, and that the counselor’s skill
would be most effectively used in helping the client to see the importance
of other factors after rather than before testing.

The first four interviews were therefore devoted to discussions of vocational
goals and opportunities and of the client’s feelings of insecurity,
and to test administration and interpretation, the testing being done by
the counselor as part of an interview. By the fifth interview the client 
began to show some interest in following up an old objective, sales and 
service work with business machines. He felt that this would pay well 
and use both aspects of his previous training. He felt, however, that his 
feelings of insecurity in relations with other people would be too great 
a handicap. He showed considerable dependence on test scores, on the 
counselor, and on two friends whom he considered successful and well 
informed. The sixth and seventh interviews were devoted to discussion 
of the somewhat low clerical perception, manual, and spatial scores,
which rather disturbed the client, and to exploration of possibilities in
mechanical and advertising fields. This exploration was almost entirely 
a matter of discussion in which the counselor quizzed the client in order 
to get him to tap his own sources of information, or supplied information 
himself, and in which he reflected the client's feelings in such a way as 
to help clarify his attitudes toward the opportunities under discussion. 

The turning point at which Mr. Revere began to assume some respon¬ 
sibility for solving his problem himself, a point long recognized as crucial 
in psychotherapy but generally unrecognized in vocational counseling, 
came in the eighth interview. By this time the client had evidently 
reached the point at which he perceived that tests had helped him as 
much as they could. They had shown him certain unsuspected weaknesses 
and some half-suspected strengths, but they had not solved any problems. 
He knew more about himself, but the decisions concerning himself still 
had to be made. The counselor had consistently refused to make them for 
him, not by saying in so many words that he must live his own life, but
by discussing problems and clarifying feeling in such a way as to leave
the responsibility for the last word always in the client’s hands.

During the ninth interview Mr. Revere showed some discomfort and
wandered considerably, not liking the fact that mechanical aptitude, 
which might mean beginning again at the bottom of an occupational 
ladder, seemed his principal asset. But he assumed the responsibility for
concluding that if he was to get beyond the point which he had reached
in clerical work he would probably have to change fields, and he made
the decision that it should be something mechanical. He felt that sales
and service would be the logical combination, as he had some contacts 
which might help him get started, and it would not mean trying to get 
the college engineering training that he lacked. In the next interview
he brought to the surface the feeling that his emotional insecurity was
the big stumbling block in sales work. This fact had been mentioned 
before, but this time he began to examine its foundations. He vacillated 
between the opinion that he could sell if he tried, and that he was so 
afraid of people that he would not be aggressive enough. By the 11th
interview his defenses were down, for he then clearly realized that the
only real obstacle to doing what he felt he should do was his own per¬ 
sonality problem. The counselor was accepting, recognized the nature of 
the dilemma, and discussed it with the client, but did nothing directly
to resolve the issue.

By the 12th interview the client seemed to have worked things through
himself, helped by the impetus gained in discussion with the counselor. 
He discussed his proposed plans with the counselor. They included some 
refresher work on business machines, to strengthen his case in applying 
for a sales job. Then he planned to talk with some of his contacts, to see 
what further orientation they could give him, especially in the way of 
places and persons to which he might apply. Then he planned to take 
time off from his job in order to carry out a thorough-going job-seeking
campaign. He would look for a job in which he might round out his
training in the use and maintenance of business machines before under¬ 
taking to sell in the field. He still manifested some doubt as to his sales 
ability, and this was gone into again. It came out that his fears had to do 
with initial contacts; he saw that in work such as this he would in due 
course reach the point at which there was not too much new-contact 
work. And he was sure enough of his relations with people he knew to
be confident that he would do well with regular customers, and even 
with new customers who were not seen without previous cultivation. This 
seemed correct to the counselor, as the case history material showed him 
to be a likable and liked young man.

The counselor had thought that this interview might close with a 
decision on the part of the client to undertake psychotherapy, as a neces¬ 
sary preliminary to vocational adjustment. The counselor was therefore 
prepared to handle the transition to another counselor. It seemed, how¬ 
ever, that the client had taken things into his own hands, and assumed 
responsibility for his own acts. Apparently he was not sufficiently
uncomfortable about his fears of meeting people to want to explore that
matter any further than he already had, with the counselor’s help, in the 
interviews on vocational adjustment. The case was therefore closed by
mutual consent and on the initiative of the client, after twelve contacts 
involving interviewing and testing. 

Exercise 12. 

a) Compare your interpretation of the test results with that of the counselor, 
and note the ways in which they differ. Study these differences in order to locate 
possible inadequacies in your conception of the significance of the tests or scores 
in question, or ways in which your insights may be more adequate than those 
of the counselor. 

b) Compare your tentative plans with those considered suitable by the counselor.
Compare your proposed approach with that used by the counselor. What
shortcomings are suggested, in your work or in that of the counselor? Evaluate
these in the light of the client’s reactions and the immediate outcomes of the 
counseling as it was done. 

Ruth Ann Desmond: Diagnosis and Counseling (case material on p. 601 ff.)

The Counselor’s Appraisal. Miss Desmond exceeded 79 out of 100 
college freshmen on the A.C.E. Psychological Examination, with almost 
equal scores on the linguistic and quantitative subtests. As she was several
years above college freshman age and scores are influenced by those years 
her actual ability was probably not as superior as this test suggests, but 
in any case she was clearly of superior mental ability. On the general 
culture test the client’s highest scores were in current social problems 
and in mathematics, both of these being in the top decile. However, her 
knowledge of science, social studies, and literature were each almost as
high, and her familiarity with the fine arts was also better than average. 
All that could be concluded from this test was that the client was a well- 
informed young woman; it helped little if at all with differential diag¬ 
nosis. 

Miss Desmond’s ability to perceive numerical and verbal symbols was 
only mediocre when compared with employed clerical workers, which 
may have had something to do with her dislike of the junior accountant’s
work. Her manual dexterity in single-handed operations was superior to
that of women industrial employees; in two-handed operations she was
about average. Her speed in dotting and tapping, operations of a manual
dexterity type which are important in mechanical and office jobs such
as machine bookkeeping work, was very superior. She was superior also
in her ability to comprehend mechanical principles and apply them to 
operations. Her ability to judge shapes and sizes, however, was low 
average when compared to the general population. 

The client’s interests, as measured by both the Kuder and the Strong, 
were very much like those of women engaged in social work, teaching 
social studies and English, and office work. These were supported by the 
Allport-Vernon Study of Values, which also brought out considerable
interest in status and prestige. The Bernreuter and Minnesota person¬ 
ality scales agreed in describing Miss Desmond as emotionally stable, the 
former adding self-sufficiency and social dominance, and the latter point¬ 
ing up poor family relations; observation of the client led the counselor 
to consider the Bernreuter scores indicative of compensatory attitudes
and behavior, and the Minnesota scores more truly suggestive of underlying
attitudes.

The counselor felt that Miss Desmond’s assets consisted of her superior 
mental ability, drive, superior general information, manual dexterity, 
and mechanical comprehension. Her clerical perception seemed good 
enough to be usable as a means to something else, even though it hardly 
seemed likely to make for success in clerical work as such. The problem
seemed to be one of finding ways in which these abilities could be used
which would be congruent with her social and office work interests. Two
possibilities occurred to the counselor: 1) statistical machine work, in
which the client’s mechanical comprehension, manual dexterity, and
mathematical ability could be combined with office work interests, perhaps
of a supervisory nature which would provide outlets for her interest
in dealing with people; 2) secretarial work, for which also she had the
necessary training, in which her mediocre clerical aptitude would be
more than compensated for by her intelligence, interest in human relations,
and, perhaps, ability to assume responsibility as an administrative
assistant or junior executive. 

The Counseling of Ruth Ann Desmond. In discussions with this
client the focus was at first on the reasons for dissatisfaction in her
previous employment. Testing was done in a supplementary way concurrently
with the interviews, by another person, it being made clear
that the counselor thought of them merely as another way of getting
some information which might be useful. When the test results were
available the counselor explained their psychological and occupational 
significance, leaving time for Miss Desmond to express her attitudes and 
feelings as he did so. 

The client suggested that perhaps getting a position as a secretary in 
an office in which she might have a variety of responsibilities, including 
contact with the public or supervision of others, and rise to more executive
types of responsibility, might be one outlet for her. The counselor
reflected the feeling that this might be a good type of opportunity, and
it was discussed further. The counselor then asked if Miss Desmond had 
ever thought of work with statistical machines, and found that she 
knew little about the opportunities in that field. These were therefore 
outlined by the counselor. The client left with the intention of exploring 
both fields. 

At the next interview she reported that she had been offered several 
stenographic jobs, one of them being in a law concern with a large and
varied practice. She believed it offered some possibilities, and said she
might take it. Her thoughts seemed to be almost entirely on this matter, 
and the interview produced little else. The next report, by telephone, 
indicated that she had taken the law concern job and was beginning 
work. 

Although this case came as one of vocational counseling, with an 
immediate problem of vocational choice which caused both counselor 
and client to focus on pertinent attitudes, aptitudes, and interests, the 
counselor was not quite satisfied with his work. It was true that the 
diagnostic picture was not clear, and that despite this a coherent picture 
of abilities and interests had been constructed which made psychological 
and occupational sense, that the immediate outcome had been the 
making and launching of a vocational plan in keeping with this picture, 
and that all of this suggested effective work along appropriate lines. 
Despite this the counselor felt uncomfortable about the case. He won¬ 
dered whether it might not be that Miss Desmond really needed help 
with a problem of personal adjustment, but had been unable to ask for
such help or even to take it when the counselor asked rather directly 
about her ideas concerning marriage. It seemed possible that she might 
even have taken the law office job as a means of putting an end to the 
counseling relationship, in which she might soon have approached her 
personal problem. If so, the counselor wondered, might he so have 
handled things earlier in the relationship as to have avoided such a 
break? To have probed and rushed right into the problem would hardly 
have improved things. But a focus on attitudes and values rather than 
on vocational interests and aptitudes might have led more rapidly 
to the development of a relationship which could have withstood the 
strain of the uncovering of emotional problems. Only a follow-up and 
the subsequent history of the case could tell, and even it might be as 
inconclusive as many of its other features were.

Exercise 13. 

a) Compare your interpretation of the test results with that of the counselor, 
and note the ways in which they differ. Study these differences in order to locate 
possible inadequacies in your conception of the significance of the tests or scores 
in question, or ways in which your insights may be more adequate than those 
of the counselor. 

b) Compare your tentative plans with those considered suitable by the counselor.
Compare your proposed approach with that used by the counselor. What
shortcomings are suggested, in your work or in that of the counselor? Evaluate
these in the light of the client’s reactions and the immediate outcomes of the 
counseling as it was done. 

James L. Johnson: Diagnosis and Counseling (case material on p. 604 ff.)

The Counselor’s Diagnosis. Mr. Johnson was given two tests of mental
ability, the Vocabulary Test of the Wechsler-Bellevue and the California
Test of Mental Maturity. On the former his intelligence quotient was 
120; on the latter his total I. Q. was 122, with language and nonlanguage 
I. Q.’s of 123 and 118 respectively. The evidence therefore agreed in 
showing him to be a man of superior mental ability, quite capable of 
performing successfully in professional or executive work. 

In ability to visualize the relations of objects of different shapes and 
sizes Mr. Johnson exceeded even the majority of engineering students,
standing at the 94th percentile on the Minnesota Spatial Relations
Test. In ability to understand the operation of mechanical contrivances
and to apply mechanical principles to practical situations he did not,
however, compare well with graduate engineers, his score being at the
15th percentile for this group. Although scores on this test are somewhat
affected by experience its effect is not very great, and in any case the client 
had had experience which had given him opportunity to increase 
his familiarity with mechanical matters and to apply his spatial visu¬ 
alization ability to mechanical problems. 

Interests were measured by the Strong and Kuder inventories. They
agreed in revealing a high degree of interest in subprofessional technical 
occupations such as production manager, occupations which provide 
outlets for mechanical interests but do not require a high level of 
mechanical ability or of interest in scientific matters. The Strong Blank 
showed some resemblance between Mr. Johnson’s interests and those of
successful salesmen and sales managers, but not as much interest in sales
work as was suggested by the very high persuasive score on the Kuder
Record. This seemed to be related to the client’s statement that he was 
interested in promotional activities, but did not like actual selling. 
Strong’s Blank showed considerable similarity of interest with those of 
men employed in business detail work, including accountants and purchasing
agents, but the Kuder yielded very low scores on the clerical and
computational scales. Apparently the client had interests like those of 
office workers but, as he himself stated, did not enjoy clerical routine
once it was established. Both inventories revealed some interest in social
welfare work, but this seemed secondary to business and production
interests.

An attempt at personality appraisal was made by means of the Bernreuter
Personality Inventory. This depicted the client as quite unstable
emotionally, dependent, introverted, moderately dominant in face-to-face 
situations, quite self-conscious, and somewhat solitary. These results 
seemed to agree with interview material which suggested an underlying 
neurotic tendency in Mr. Johnson. This material consisted of his de¬ 
scription of himself as a worrier, and of a possible interpretation of his 
vocational dissatisfactions as due to personality maladjustment rather 
than to vocational misplacement. These maladaptive tendencies seemed, 
however, to be well under control, as evidenced by Mr. Johnson’s suc¬ 
cess in each of his jobs, his employers’ desires to have him stay with them, 
and the fact that each change of employment so far had been for a 
definitely superior position. Although it seemed to the counselor that
the client might be paying too high a price emotionally for his success,
the fact that he did not take advantage of the rather permissive counseling
relationship to work on personality problems led the counselor
not to press him to open up that area. 

In summary, it seemed that Mr. Johnson was a man of superior general 
mental ability capable of achieving, as he actually had, at the professional
and executive level. His low mechanical comprehension and scientific 
interests indicated that he had perhaps done well to avoid engineering 
occupations, although he did have the spatial aptitude and lower tech¬ 
nical interests which might make industrial work appeal to him. This 
probably explained the satisfaction which he found in the factory job 
which he held after graduation, despite the poor working conditions and 
lack of advancement which caused him to leave it. The combination of 
business, technical, and welfare interests shown by the inventories, com¬ 
bined with his own stated preferences, indicated that he should find 
satisfaction in office work of a supervisory nature, in which he did no
detail work but was responsible primarily for planning and for outside
contacts. It was felt that he might have difficulty adjusting to the emo¬ 
tional demands of some jobs, but his success in his previous positions and 
in moving to progressively better jobs led to the conclusion that under 
favorable conditions he would be able to make the necessary adjustments. 
Psychotherapeutic help might enable him to get more satisfaction from
his work and from other aspects of life by relieving him of the load of
anxiety which it was suspected that he carried, but it might be more 
appropriate for him to seek such help after he had made the change 
back to a peacetime job than during the transition period. 

The Counseling of James L. Johnson. In this case the counseling 
procedure consisted of the initial interview for the collection of case 
material and the determination of the problem to be worked on, followed 
by the administration of the tests the results of which have just been 
summarized. Then followed an interview for the discussion of the implications
of the test results and of the meaning of the client’s experiences
to date. These were followed by three more widely spaced
interviews, in which job-seeking plans were thought through, related
activities were reported and evaluated, and the suitability of openings 
discovered was discussed. 

In the first interview following testing, Mr. Johnson’s test data were 
interpreted as favoring employment in fields such as production management,
personnel work, buying, and general administrative work such as
he was contemplating. It was suggested that sales work did not seem 
indicated, and that his former engineering interest might have proved 
to be an unwise choice had it been followed through. After this rather 
directive interpretation by the counselor the discussion shifted to a re¬ 
view of the client’s experiences in the light of the test results, and of the 
test results in the light of the client’s experience. In this process the
counselor was relatively nondirective, reflecting the feelings and atti¬ 
tudes expressed by the client, and occasionally asking a question designed 
to assist the client in his thinking. When, for example, contemplation 
of his low clerical score on the Kuder and the moderately high clerical
score on the Strong caused the client to remark: “So I don’t like clerical
work but I do have interests somewhat like those of office workers,” the
counselor asked, “What meaning does that have for you?” This led
the client to state that he supposed he would find working with that 
kind of people congenial, but that he would want to have duties other 
than responsibility for clerical detail. The question, “What kinds of
jobs might offer you that combination?” led to an exploration of super¬ 
visory and public contact jobs in business. Further discussion led the 
client to conclude that, everything considered, the position of adminis¬ 
trative assistant would probably offer him the best chance to do congenial 
work and to learn enough about some type of enterprise to enable him
to assume executive responsibilities. Other supervisory and contact jobs
did not seem equally open because of lack of specific experience other 
than clerical and educational. 

In the next interview the means through which the client might locate 
suitable openings were explored. He revealed a good orientation to job- 
seeking methods, so the discussion was primarily an opportunity for him 
to use the counselor as a sounding board for his own analysis of each 
lead, of the best way in which to use it, and of the suitability of the kinds 
of jobs which it might yield. 

Subsequent interviews were devoted to discussion of the openings 
which the client located during his job hunting. One of these was in 
personnel work with an oil company; others were: accountant with 
supervision of an accounting department for an important foundation, 
industrial relations work with a rubber manufacturer, industrial engi¬ 
neering in an electrical equipment factory, and administrative assistant 
to the head of a large business enterprise. The two personnel positions 
had considerable appeal, but both involved certain limiting conditions 
which made the client hesitate, one geographic and the other the nar¬ 
rowness of the job because of the specialization in the office. The ac¬ 
counting job was with an organization which would have provided very 
pleasant working conditions and good pay, but the client knew that he 
would have to relearn a great deal about that type of work and that the 
work itself would not appeal to him. The industrial engineering job 
would have involved beginning rather low in the scale and working up, 
and at his age and with his experience the client did not feel he should 
make such adjustments. The position of administrative assistant had the 
most appeal, for it not only paid well, but was described as one which 
had, for some previous incumbents, led to higher-level executive positions 
in this and in other companies. Mr. Johnson was quite enthusiastic about 
this possibility, and the counselor felt that it was compatible with his 
interests and abilities. The client stated that if the offer materialized he 
would accept it. 

Exercise 14. 

a) Compare your interpretation of the test results with that of the counselor, 
and note the ways in which they differ. Study these differences in order to locate 
possible inadequacies in your conception of the significance of the tests or scores 
in question, or ways in which your insights may be more adequate than those 
of the counselor. 

b) Compare your tentative plans with those considered suitable by the counselor.
Compare your proposed approach with that used by the counselor. What
shortcomings are suggested, in your work or in that of the counselor? Evaluate 
these in the light of the client’s reactions and the immediate outcomes of the 
counseling as it was done. 



CHAPTER XXIV 

ILLUSTRATIVE CASES: FOLLOW-UP 
AND EVALUATION 


The Validity of Vocational Appraisals in the Light
of Subsequent Work Histories

EACH of the seven cases discussed in the preceding chapter was followed
up some time after counseling in order to find out in what type of work
he was engaged, how well he liked it, what aspects of it he disliked, and
how well the ultimate outcomes of counseling agreed with the appraisals
made by the counselors. The time that elapsed between the closing of
the case and the follow-up varied greatly. In one case it was only three
months, as the case was handled not long before this chapter was written.
In one it was 15 months. In several it was two years. In some it was six
years, and in some it was even longer. The cases were, with one exception,
selected partly because enough time had elapsed since counseling to make
follow-up meaningful.

In one case the follow-up was through personal contacts which, by a
happy combination of circumstances, were renewed from time to time
over a period of several years. In certain others it was made through
correspondence supplemented by the personal contacts of others living
in the same communities. And in still others the follow-up consisted
solely of a brief exchange of letters. Such methods leave a good deal to
be desired, as they are not likely to yield emotionally toned material and
provide insufficient opportunity for the exploration of important issues.
Their results are, however, given for the insights which they do occasionally
give into the adequacy of the understandings derived from the
tests and related diagnostic procedures. Inadequate though they may be,
the obtaining of even these follow-up data represents an advance over
much that is done in the way of test evaluation. It is in the intensive
follow-up and evaluation of clients’ adjustment that the greatest advances
still remain to be made.

The follow-up data for each of our seven cases are presented in the
paragraphs which follow, accompanied by comments on the adequacy
of the testing and of the appraisal in the light of these data.

The Early Career of Thomas Stiles (see also p. 590 and p. 607)

Subsequent History. Tom was followed up by means of a personal
letter some eight years after he was tested and counseled. His letter was
brief and factual, giving an outline of his experiences during the intervening
years but not going into detail concerning his attitudes toward
these experiences. After graduation from high school he took a summer
job, comparable to those he had held in previous years. He was then
admitted to apprenticeship training in a large metal-products manufacturing
concern, remaining for six months before illness caused him to
resign and return home. Several months later he accepted employment
as a tool grinder with a company within commuting distance of his
home, where he worked for a period of two years. He was successful at
this work, but felt the need for more training of the type that had been
interrupted by his illness. He therefore gave up this job and entered
one of the subprofessional technical schools which he had discussed with
the counselor three and one-half years previously, taking a two-year
course in steam and diesel engineering. He graduated after the normal
two years, having enjoyed the training and worked during the one summer
in another metal products factory. After completing his training he
was placed, by the school placement service, in a job with a manufacturer
of railway locomotives. Nine months had elapsed on this job at
the time of writing, and Tom felt that he had successfully begun a career
of the type which appealed to him most, with a concern which would
offer him security and advancement.

Validity of the Appraisal. The plans carried out by Tom seem to
have corresponded rather closely with the appraisal of the counselor insofar
as type of activity is concerned, although there was some floundering
at the start and the ultimate achievement level was the highest of those
which had been deemed likely. It will be remembered that the counselor
had thought of apprenticeship or on-the-job training as equally appropriate
in Tom’s case as formal training in a technical institute. Although
the interruption in the apprentice training which he began after completing
high school seems to have been due to factors not related to his
interests or abilities, the fact remains that it was interrupted, that a
period of work followed which served to finance schooling and to confirm
his desire for it, and that in the end he graduated from a technical
institute and obtained employment at the skilled level. The only way in
which Tom’s history differed from that discussed in the counseling
process was in the selection of steam and diesel rather than gasoline
engines, but this difference is, from the standpoint of aptitudes and basic
interests, quite insignificant.

One might conclude from this one case that test results and the 
counselor’s diagnoses tend to correspond rather closely, were it not 
for the fact that while a case may illustrate it cannot prove. The case of
Marjorie Miller, which follows, serves to bring out the complexity of
people and of occupations, and to underline the fact that vocational 
adjustment is often a process of unfolding rather than of predicting. 

Exercise 15.

Compare your interpretation of the test data and the plans which you con¬ 
sidered suitable with the report of the subsequent history of this client. In what 
ways were your ideas on the case borne out by experience? In what ways do you 
seem to have been wrong? What do you think may have been the causes of your
mistakes or of the mistakes which the tests led you to make? How do discrep¬ 
ancies between your test interpretations and the outcomes of the case add to 
your understanding of the validity data reported in investigations using the 
tests? 

The Early Career of Marjorie Miller (see also p. 592 and p. 609)

Subsequent History. Marjorie carried out the decision reached with
the aid of the counselor and applied for the special scholarship at the
high-ranking college. Drawing on the diagnostic data made available
by the counselor, the principal gave her an extremely favorable and yet 
objective recommendation. She was awarded the scholarship, which pro¬ 
vided all she needed to supplement her family’s financial backing, for 
her four years in college. At the end of her freshman year the counselor 
had a letter from Marjorie, expressing her appreciation of the educa¬ 
tional experience which he had helped her to obtain, and describing 
some of her reactions to her first year. Apparently her horizons had been 
so broadened by the experience that she felt considerable gratitude to 
the counselor for having made her aware of the advantages of the type of 
college she was attending and for having found a way to make it finan¬ 
cially possible. The next contact came at the end of Marjorie’s college 
career, when the counselor received an announcement of the graduation 
ceremonies in which Marjorie was to participate. The third follow-up 
was made through one of the college personnel officers, a year after 
Marjorie had graduated and six years after the counseling took place. 

Marjorie’s freshman program in college included chemistry, economics, 
English, and German. Her grades for the year were C, C, B and B, re¬ 
spectively. Her college personnel record showed that her goal, at the 
beginning of this year, was “nutritionist, chemist, or social worker.” 
Late in her freshman year she discussed her choice of major field with 
a counselor, who was impressed by her intelligence, viewpoint, and 
enthusiasm. She talked also with the heads of the science departments 
in which she was most interested. 

During her first summer vacation Marjorie worked as a sales clerk in 
a department store, and acted as head of the department in which she 
worked. Her employer reported that “Marjorie has better than average 
intelligence, good initiative, and excellent character. While at work in 
my store she handled selling duties very well although she had had no 
previous training in this field. She is ambitious and would succeed in any 
work which she undertook.” 

In her second year in college Marjorie apparently shifted from her 
former scientific inclinations and majored in child study. She took four 
courses in this subject, continued economics and German, and added 
physiology and psychology. Her marks for the year were all B+ or B. She 
continued along these lines during her third and fourth years, concen¬ 
trating more and more on psychology and child study. Her marks im¬ 
proved steadily, and she graduated 85th in a class of about 250 students. 

Marjorie’s extracurricular activities consisted of working on the college 
newspaper, as a reporter during her first year and as assistant managing 
editor of a new and rival publication during her sophomore year. She 
was an officer, and ultimately president, of a campus religious organiza¬ 
tion. She served as co-editor of her class yearbook in her senior year. 

In her last summer vacation Marjorie took a position as a playground 
instructor in one of the large cities, receiving ratings of “excellent” in 
industry, ability, attitude, and attendance. She also did field work with 
children in a local settlement house as part of her academic work during 
the year. Her supervisor’s report read: “She showed a fine understanding 
of the needs of individual children, was responsible for completing tasks 
assigned to her, and showed initiative in many situations where students 
frequently wait for direction. She has a friendly personality and adjusts 
easily to new situations.” Another supervisor spoke of her “good sense of 
orientation, quick grasp of problems. What is more, she showed good 
intellectual and emotional insight into the life of children.” 

When Marjorie registered with the placement office during her senior 
year she stated that she wanted to teach in a public or private nursery 
school. In a later contact she expressed the same interest, but hesitated 
about applying for specific openings because she was thinking of 
marrying soon and really wanted a job more than a career. A little later 
she expressed an interest in an opening as a field secretary for one of the 
scouting organizations, had an interview with a representative of the 
national office, and was employed in a branch near her home town. 

A final follow-up revealed that Marjorie was doing well in her work, 
found it very satisfying, and had been promoted to a more responsible 
position in the same organization. Although she still had marriage in 
mind, it had receded into the background at least temporarily, and she 
looked forward to continuing in the same work for the foreseeable future. 

Validity of the Appraisal. Marjorie’s grades in college were in line 
with the counselor’s expectations, when he disagreed with the high-school 
principal’s characterization of the girl as brilliant. She did prove to be, 
as he anticipated, a good student in her chosen field, graduating at the 
bottom of the upper third of her class. It is interesting to note, however, 
that her work in science and freshman (but not sophomore) economics 
was only at the C level. Her achievement in the more verbal subjects 
was better than that in the more quantitative, as suggested by the analysis 
of her school grades. This trend was not, however, clearly foreshadowed 
by the test scores. 

Of major interest is the predictive value of the interest inventories. 
These, it will be remembered, showed dominant interest in the scientific 
fields, with some signs of interest in social welfare and religion. Her 
school and leisure-time activities did not do much to decide the issue 
one way or the other, as they included scientific and social interests. 
The subsequent history showed that, contrary to the counselor’s expec¬ 
tation, the secondary social welfare interest pattern became dominant 
as time went by. It has been seen that Marjorie carried out the program 
of exploration in both scientific and social areas which the counselor had 
recommended for her freshman year, and that, whether because of 
interest, aptitude, or some combination of the two, she then focused 
entirely on the social welfare field. 

The counselor’s private opinion, then, which he did not let influence 
his counseling, was mistaken. He had thought that exploration would 
confirm Marjorie in her adolescent choice of a scientific occupation, 
whereas in fact she decided to prepare for and actually entered a social 
welfare occupation. Stated as flatly as this, the outcome might lead one 
to conclude that the test results had actually been misleading in this 
case. But reconsideration of the data will reveal that the basis of Marj¬ 
orie’s subsequent actions can be seen in the high-school counseling 
record. Subordinate though the trends seemed to the counselor at the 
time, there were indications of social welfare interests. Isolated from the 
rest of the pattern these indices are rather impressive: 

Unusual reading speed. 

Verbal grades superior to quantitative. 

Secondary social welfare interest pattern. 

Feminine (i. e. social and literary) interests. 

Active on school paper. 

Dramatic club member. 

Scout leader. 

The foundations for the choice of a social welfare or literary occupa¬ 
tion were clearly there. It was only the more dominant interest in science 
supported by superior achievement in the sciences and equally important 
scientific avocations which led the counselor to believe that success and 
satisfaction were most likely to lie in the applied sciences. 

Perhaps the principal conclusion to be drawn from this case, however, 
is that even in the case of some well-motivated, clear-thinking, able 
high-school seniors, interests and abilities are still in the process of developing 
or, at least, of coming to the surface of consciousness. When more than 
one pattern of abilities and interests is noted, it is therefore wise for the 
student to plan a program of study, work, and leisure which provides for 
further exploration of the two or three dominant patterns. The diagnostic 
process may serve to reveal areas in which exploration can best 
be concentrated, and counseling may have as its function the planning 
of appropriate types of exploratory activities. Actual decision making may 
not come for some time, and then it will turn out to be a step-by-step 
process rather than an event. 

Exercise 16. 

Compare your interpretation of the test data and the plans which you con¬ 
sidered suitable with the report of the subsequent history of this client. In what 
ways were your ideas on the case borne out by experience? In what ways do you 
seem to have been wrong? What do you think may have been the causes of your 
mistakes or of the mistakes which the tests led you to make? How do 
discrepancies between your test interpretations and the outcomes of the case add to 
your understanding of the validity data reported in investigations using the 
tests? 

The Early Career of Ralph Sheridan (see also p. 595 and p. 612) 

Subsequent History. In response to a follow-up letter written six years 
after counseling, Ralph wrote in part: 

At the time, I was planning to enter Rensselaer Polytechnic Institute. The tests 
showed that while I might make a fairly decent showing at engineering, I would 
probably do better at other things. I think they were 100 percent right. 

I entered Rensselaer, but was obliged to withdraw during the third semester by 
the death of my brother. My marks were such that I could have got by, but not 
much more. 

I then went to work in an abrasive factory, at a semiskilled job which I liked 
rather well but found monotonous. Then I entered the Seabees, and enjoyed 
the construction work we did at various Naval installations. 

Upon my discharge, I went back to the factory as a foreman, then rose to as¬ 
sistant to the superintendent. Since then there has been a decline in the volume 
of production, and consequently a reduction in the number of employees. I 
have worked at various temporary jobs in the same plant since then, none of 
them significant, just to keep on working until the beginning of the Fall term. 

I plan to enter the Syracuse University School of Business this Fall, which you 
suggested (very wisely, I now believe) as a suitable plan six years ago. I think 
that your analysis of my abilities and interests was quite accurate, and can say 
that, even though I did not act upon it then, it has helped me to understand 
my subsequent experiences and to make plans based upon my assets as they have 
been demonstrated in my work. 

Validity of the Appraisal. It is interesting to note that the client has 
emphasized, in retrospect, the strength of the suggestion of business 
administration training made by the counselor and the definiteness of 
the verdict of the tests (“they were 100 percent right”). Such 
oversimplification on the part of counselees is common, and should serve as 
something of a deterrent to counselors who tend to be over directive. The 
data themselves are generally quite sufficiently directive: if not, more 
direction in the form of advice from the counselor may well be harmful. 

The validity of the appraisal is demonstrated by several facts in 
Ralph’s experience. First, there is the mediocre record in engineering 
school, confirming the diagnosis of weakness in that area. Secondly, there 
is the success in the administrative side of production work, which led 
to promotion to foreman and assistant to the superintendent. And 
finally there is the client’s conclusion, from these and his other 
experiences, that business administration training would be most in line with 
his interests and abilities and would best equip him for work of the type 
he wanted. 

The case is interesting, also, in that it illustrates how a counselee may 
reject the implications of tests and counseling, proceed to try out his 
plans, and ultimately revise them to make them conform with the origi¬ 
nal counseling. In Ralph's case, as in many others, the counselee seems 
no worse off for having found things out “the hard way”; in fact, he may 
have learned some useful lessons as a result, and may profit more from 
his subsequent education than he otherwise would have. But testing and 
counseling seem to have been valuable in forearming him, in making it 
easy for him to learn from experience and to revise his plans as needed. 
Testing and counseling, then, converted floundering into exploration. 

Exercise 17. 

Compare your interpretation of the test data and the plans which you 
considered suitable with the report of the subsequent history of this client. In what 
ways were your ideas on the case borne out by experience? In what ways do you 
seem to have been wrong? What do you think may have been the causes of your 
mistakes or of the mistakes which the tests led you to make? How do discrep¬ 
ancies between your test interpretations and the outcomes of the case add to 
your understanding of the validity data reported in investigations using the 
tests? 

The Early Career of Paul Manuelli (see also p. 597 and p. 613) 

Subsequent History. Paul, like Ralph, was heard from by mail six 
years after he graduated from high school. He wrote as follows: 

I graduated from high school with honors, a medal for excellence in United 
States History, and a scholarship at Carnegie Tech. 

I spent my freshman year at Tech, studying mechanical engineering. I was 
permitted to omit Freshman English, taking Literature in its place. I played on the 
Varsity Football squad. I made a C+ average, which was all right as my 
scholarship was based more on athletics than on academic achievement. 

After we got into the war I joined the Navy V-12 program and was transferred 
to Stevens Institute, where I graduated with a B.S. in Mechanical Engineering 
in 1945, and grades averaging from 75 to 80. I made the Dean’s List once, in 
my junior year, played on the football team, belonged to the senior honorary 
society, was class orator, was listed in the American student “Who’s Who,” was 
company commander, was on various student committees, and took part in 
theatrical productions rather regularly. 

Transferred to Columbia Midshipmen’s School, I was commissioned an Ensign 
in the Naval Reserve, graduating 259th in a class of more than 1000 men. I 
served as company commander here also. 

I was assigned to duty on a cruiser, as a junior officer in firerooms, machine- 
shop, and main engines. Studying under the chief engineer, after seven months 
aboard I qualified as engineering watch officer and stood watches, in complete 
charge of the engineering facilities. 

My discharge came late in 1946, after which I joined the Atlas Corporation as a 
student engineer. I completed a year’s study in which I spent several months in 
each of their main divisions, learning all the operations from design to sales and 
service. After completing this course, I requested assignment to production en¬ 
gineering; I could have asked for development engineering, but I felt that I 
would be better qualified for development work if I became thoroughly familiar 
with the problems of improving existing designs, making them easier to 
manufacture, etc., before trying development work. I would like ultimately to be a 
senior engineer with a department of my own, but of course that is a long-term 
objective. Before that happened, I think I might be tempted to shift to factory 
administration, as I enjoy working with people and handling long-range 
problems. 

I enjoyed my schooling, and even though I never made top grades I never had 
any worries about passing. I got as much out of college as most students, and 
never disliked any courses; I suppose I cared least for drafting, as drawing prints 
is an anti-climax after actually solving a problem. The Navy was all right too. 
My work with Atlas has gone smoothly, and I have never felt unprepared or 
unable to handle the work that has come my way. 

I have had two substantial raises since coming to Atlas, and consider my rate of 
progress satisfactory. I have been given a fair amount of responsibility, having 
been sent to deal with other companies with power to purchase, alter designs, 
and in other ways represent Atlas. 

Validity of the Appraisal. The subsequent history of Paul Manuelli 
confirms, in its general outlines, the appraisal made by the counselor 
and is in line with the plans discussed in counseling. Although the war 
changed the details of Paul’s education, he developed in ways which had 
been anticipated in counseling. Some of the minor, specific, ways in 
which his history did differ from the forecast and planning are of interest, 
and are taken up below. 

One of the counselor’s misgivings, it was pointed out, concerned Paul’s 
low spatial visualization score. This had been discussed with Paul as an 
indication that he might do well to check his performance in drafting 
and related types of activities, and that trouble there might lead him to 
shift to a field requiring less spatial ability. The subsequent history 
shows no actual difficulty with spatial visualization, but the fact that 
his grades were not as good as the other indices would have led one to 
expect may be due to weakness in this special area. It is perhaps signifi¬ 
cant, too, that drafting was the one subject that Paul liked least. Al¬ 
though the stated reason was the lack of intellectual challenge, it is 
quite possible that the underlying reason was difficulty in transferring 
spatial concepts to paper. There are many men as bright as Paul, with 
interests just as intellectual, who enjoy seeing an idea take shape on the 
drafting board. 

If it is indeed true that Paul is somewhat handicapped by low ability 
to visualize space relations, his future development will be worth noting, 
for both production and development engineering, in the mechanical 
field, should require considerable ability of this type. One might hazard 
the guess that Paul may well eventually “be tempted to shift to factory 
administration” not only because of his interest in people and planning, 
but also because of frustration in the more technical aspects of develop¬ 
ment work. 

The leadership qualities seen in Paul’s high school extracurricular 
record and reflected in his rather high business contact interests on Strong’s 
Blank continued to manifest themselves during his college, Navy, and 
business career. These also suggest that, once he is firmly established as 
an engineer in his company, he may want to change to administrative 
rather than technical work. On the whole, Paul’s leadership record is 
superior to his academic record. 

Although it is not concerned with testing, one final point is of interest. 
The counselor was apparently unduly pessimistic about the ability of 
this student to finance his way through college, pessimism which seems to 
have been quite unwarranted in view of the award of the four-year athletic 
scholarship. In this respect at least, the student may have shown more 
savoir faire than the counselor. 

Further follow-up would be highly desirable in this case, not in order 
to provide more guidance (Paul seems to be handling his career very well), 
but in order to see which predominates in the end, his technical interests 
and abilities or his social interests and abilities. In the meantime, it may 
be concluded that development has been very much like that anticipated 
in the counseling process. 

Exercise 18. 

Compare your interpretation of the test data and the plans which you con¬ 
sidered suitable with the report of the subsequent history of this client. In what 
ways were your ideas on the case borne out by experience? In what ways do you 
seem to have been wrong? What do you think may have been the causes of 
your mistakes or of the mistakes which the tests led you to make? How do dis¬ 
crepancies between your test interpretations and the outcomes of the case add 
to your understanding of the validity data reported in investigations using the 
tests? 

The Early Career of James G. Revere (see also p. 599 and p. 615) 

Subsequent History. The follow-up of James Revere took place after 
a lapse of only a few months, a much briefer period than in the cases of 
the other counselees. The follow-up data are therefore not very helpful, 
for insufficient time had elapsed since counseling for the shape of events 
to become clear. The case has been included because it illustrates, better 
than most recorded cases, the complexity of the vocational adjustment 
problems presented by many men and women of about 30 years of age. 
Briefly, Mr. Revere obtained a selling job in which he thought he might 
use his previous training and in which he received training and 
supervision as a beginning salesman. A letter from him indicated that he thought 
he was off to a good start in this new field. 

Validity of the Appraisal. It is still too early, at the time of writing, to 
judge the adequacy of the work done with this client. 

Exercise 19. 

In view of the absence of follow-up material, no evaluation of interpretation 
can be made for this case. 

The Early Career of Ruth Ann Desmond (see also p. 601 and p. 620) 

Subsequent History. A letter follow-up for this case more than two 
years after counseling brought information which was as surprising as the 
counselor’s misgivings might have led him to expect. After working with 
the law concern as a secretary for one year, Miss Desmond gave up the job 
and went to Pittsburgh. She accepted a teaching fellowship at the Univer¬ 
sity of Pittsburgh, where she taught accounting and began work toward 
the master’s degree in business administration. At the same time, she en¬ 
rolled as a student of drama at the Carnegie Institute of Technology, 
carrying a program there which would lead to a master’s degree in dra¬ 
matics. She completed both of these programs, but insufficient time had 
elapsed at the time of her letter for her subsequent career to have taken 
shape. 

Validity of the Appraisal. The present outcome of this case is unlike 
anything anticipated in the analysis of test and personal data by the 
counselor or in the counseling interviews. A rereading of the case (p. 601 
and 620) will remind the reader that the counselor had thought of statis¬ 
tics (machine work) and secretarial work leading to administrative 
responsibilities as suitable outlets for Miss Desmond, and that her em¬ 
ployment by the law concern was appropriate. The fact that she left it 
after about a year suggests that it was not actually so, although the reason 
is not clear. Perhaps her poor family relations had something to do with 
it. In the absence of detailed interview or projective test material one can 
only speculate. 

It may be more profitable to inquire whether there was something 
pulling her toward the field of dramatics, than to speculate as to what 
drove her out of secretarial work. Her high literary-legal interest scores 
on the Strong and Kuder may have had something to do with it; 
moderately high artistic interest on the Strong Blank may also have played a 
part. If her good personality inventory scores were, as suggested, 
compensatory in nature, a search for artistic and literary outlets for her emotions 
may have been another factor. 

Most important of all, in the writer’s mind, is the uncomfortable feel¬ 
ing he had in working with and closing the case. This feeling carried with 
it the conviction that there were unsolved problems in Miss Desmond’s 
case which would keep her unsettled and on the move in search of 
happiness. This insight, or perhaps it was only a hunch, was not the result of 
tests or test data, but rather of interview data and observation. Whether 
or not it was correct can be ascertained only by her subsequent history, 
but that part of it which had elapsed at the time of her letter was not 
reassuring. The carrying of two such different loads as those taken on in 
Pittsburgh hardly seems like a normal or well-conceived plan. 

Exercise 20. 

Compare your interpretation of the test data and the plans which you con¬ 
sidered suitable with the report of the subsequent history of this client. In what 
ways were your ideas on the case borne out by experience? In what ways do you 
seem to have been wrong? What do you think may have been the causes of your 
mistakes or of the mistakes which the tests led you to make? How do 
discrepancies between your test interpretations and the outcomes of the case add to 
your understanding of the validity data reported in investigations using the 
tests? 

The Early Career of James L. Johnson (see also p. 604 and p. 623) 

Subsequent History. The hoped-for opportunity to accept employment 
as administrative assistant to the head of a large business enterprise 
materialized, and Mr. Johnson wrote the counselor a week or so after the 
final interview that he had accepted the offer. He was very pleased with 
the nature of the work, with the associates with whom it threw him, and 
the excellent salary which he was paid. He expressed his appreciation 
of the counselor’s services in helping him to clarify his objectives and to 
carry out his job-hunting campaign. 

A follow-up letter was sent to this client two and one-half years after 
he took the job as administrative assistant, expressing an interest in know¬ 
ing how he liked the work, the nature of his subsequent experiences, and 
what he thought of his present situation. An immediate reply was 
received, a brief but friendly letter in which Mr. Johnson summarized the 
experience of the intervening two years. He was still working in the same 
position, and had had a substantial raise at the end of his first year. He 
felt that the prospects with his present company were excellent, and he 
had developed contacts through his work which might well lead to other 
opportunities should he be interested. The work had proved to be very 
much to his taste: detail was taken care of by a competent office force, 
and his own duties involved development work and contacts with a vari¬ 
ety of executives both in the company and in other concerns. There was 
no indication, in the letter, of anxiety or difficulty in interpersonal rela¬ 
tions such as it was thought might develop at the time of counseling. 
While failure of such signs to appear in a letter proves little, it did seem 
significant that a formerly dissatisfied man wrote a letter in which satis¬ 
faction was the only manifest attitude. 

Validity of the Appraisal. In this case, unlike Marjorie Miller’s, the 
only follow-up data are general. They give blanket confirmation of the 
appraisal made at the time of counseling, insofar as type of activity in 
which success and satisfaction might be found were concerned. The client’s 
work was of a type which counselor and client had agreed should prove 
satisfactory, and it had proved satisfactory over a significantly long period 
of trial. 

The counselor’s belief that emotional maladjustment might create 
difficulties for Mr. Johnson did not seem to be substantiated. As was 
pointed out, however, the failure to find confirmation of this belief in a 
letter hardly constitutes proof. The fact that the letter expresses satisfac¬ 
tion with the job and with its prospects may nevertheless be taken as some 
evidence of the fact that, as in the past, Mr. Johnson was handling 
whatever emotional problems he might have with considerable success, 
winning the confidence of his employers and carrying out his work effectively. 

Exercise 21. 

Compare your interpretation of the test data and the plans which you con¬ 
sidered suitable with the report of the subsequent history of this client. In what 
ways were your ideas on the case borne out by experience? In what ways do you 
seem to have been wrong? What do you think may have been the causes of your 
mistakes or of the mistakes which the tests led you to make? How do discrep¬ 
ancies between your test interpretations and the outcomes of the case add to 
your understanding of the validity data reported in investigations using the 
tests? 


Conclusions 

The seven cases summarized and discussed in these chapters have served 
to illustrate the nature and use of data from a variety of tests, together 
with the need for personal data as a background against which to 
interpret test results. They illustrate the way in which tests sometimes serve to 
predict with considerable accuracy the type of field in which success will 
be found (Stiles, Sheridan), sometimes foreshadow in a general way 
developments which they cannot forecast (Miller, Manuelli, Johnson), and 
in still other cases leave one with a baffled feeling of not having gotten 
to the heart of the matter (Revere, Desmond). In some cases the tests 
yielded important insights which could not well have been obtained by 
other means (Sheridan, Revere), in others they merely seemed to confirm 
what other data revealed (Stiles, Manuelli), and in still others they con¬ 
tributed to an understanding of the client but did not point the way to 
immediate solutions (Miller, Johnson, Desmond). 

Problems of test interpretation for vocational counseling at several 
different levels have been illustrated. Four cases were high school students, 
one of them considering a skilled trade, three of them college majors. One 
was a young adult with a high school education, concerned about progress 
in the clerical or subprofessional technical field. One was a recent college 
graduate, dissatisfied with the occupations for which her major field had 
prepared her. And one was a man in his mid-thirties, seeking to 
re-establish himself on a higher plane after the war than that at which he had 
worked before the war. 

Many more cases would be necessary in order to illustrate all the points 
which a user of tests should be familiar with in practice. But the problems 
presented in this chapter, and the opportunity provided for the student 
to work out his own answers to them before reading what actually 
transpired, should provide sufficient exercise with “paper cases.” It now 
becomes incumbent upon the student-counselor to test some live clients, 
analyze the test results and relevant personal data, prepare psychometric 
reports in which he draws on all of the knowledge of the educational and 
occupational significance of tests which the study of this book and of the 
original studies on which it is based should have given him, and obtain 
the criticism of a qualified supervisor. As he works with students or clients, 
and makes his own formal or informal follow-up studies, he will gain 
that richer understanding of tests, of occupations, and of vocational and 
clinical psychology which is the earmark of the well-rounded counselor or 
personnel worker. 


STATISTICAL CONCEPTS 


THIS appendix consists of two sections, the first dealing with common statistical
terminology, the second with the concept of prediction as applied to psychological
testing in vocational guidance and selection. The first section is elementary
and is included only as an aid to those who may not have the background
in measurement which the reading of this book and of the literature on
test validities requires. It does not attempt to serve as a manual of statistics or
to provide sets of statistical tables. Those are available in Garrett (282) and
Walker (906). It should be skipped by all who are familiar with the terminology
and concepts of statistics. The second section is of more general interest and
contains material important to many readers who have some knowledge of
statistics. It also emphasizes logic rather than formulae or tables (see p. 654).

Statistics: Quantified Reasoning 

Those who are not used to working with numbers, or who have had inadequate
instruction in mathematics in high school, often approach statistics, and
even reports which include statistically presented data, with their minds made
up that they cannot understand them. Such readers should bear two things in
mind. First, as shown in Chapter 6, the relationship between verbal ability and
achievement in mathematics at the senior high school and college levels is as
close as that between quantitative aptitude and mathematics: therefore an intelligent
person can learn what he needs to in the way of high school or elementary
college mathematics. Second, statistics is nothing more than logic expressed
in numerical form: therefore any reader who can engage in logical
reasoning can master elementary statistics, and those who enjoy logic should
enjoy statistics.

It is not the purpose of this section to convey any knowledge of statistical 
formulae and computation. That is not necessary for the reading or understanding
of this book. But as an understanding of the concepts of statistics is
necessary both for the reading of books such as this and for the interpretation 
of tests, the following paragraphs attempt to explain briefly the meaning of the 
commonly used statistical terms. 



Central Tendency 

Tests are measuring instruments. Measurement generally involves the comparison
of one entity with some other entity. This may be the weight of a person
and of some pieces of iron, the length of a table and of a standard-sized object
called an inch (in French an inch is a “pouce,” or thumb), or the number of
words understood by a first grader and the number understood by other first
graders. In psychological work the comparison of a person with something
usually requires that he be compared with other persons. Although one can
simply count the number of words understood by a first grader, that number
has little meaning unless one also knows how many are understood by the
average first grader.

In order to make such comparisons, the typical achievements, aptitudes, interests,
or personality traits of various kinds of groups must be expressed in
summary form. Numerical summarizations of group characteristics are called
measures of central tendency. Measures of central tendency are averages. Averages
can be expressed in several ways, as medians, means, or modes.

Median. The median (Md) is the middle person or middle score in a distribution
of persons or scores. In a group of five boys standing in the order of
their height, the third boy from either end is the median boy. This method of 
expressing averages is commonly used when the number of cases is not large, 
when a few extreme cases might distort other indices of central tendency, and 
when a quick estimate is desired. 

Mean. The mean (M) is what is most often meant when, in everyday language,
we talk about averages. It is computed by adding up the ages, heights, or
I. Q.’s of all persons in the group and dividing the sum by the total number of 
cases. The mean is the most widely used statistical measure of central tendency 
because it is part of a system which lends itself to many types of manipulation. 
When the number of cases is small, however, it can be seriously distorted by a 
few extreme cases. 

Mode. The mode is the height, age, or test score which is most common in
the group. In a perfectly normal distribution it is identical with the median and
mean, but in skewed distributions it is not the same. It is ascertained by inspection,
to see where most cases fall. A distribution may be bimodal, or have two
modes, in very special cases, but it would still have only one mean and only one 
median. The mode is rarely used, however, because it does not lend itself to as 
wide a variety of applications as the mean or even the median. 
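
The three averages described above can be illustrated with a short computation. The sketch below uses the standard `statistics` module of Python; the scores are invented for the purpose and come from no study cited in this book. One extreme case is included to show how the mean, but not the median, is pulled by it.

```python
import statistics

# Hypothetical test scores for a small group; the 95 is a deliberate extreme case.
scores = [40, 42, 45, 45, 48, 50, 95]

median = statistics.median(scores)  # middle score when the scores are ranked
mean = statistics.mean(scores)      # sum of the scores divided by the number of cases
mode = statistics.mode(scores)      # score that occurs most often

print("median:", median)
print("mean:  ", round(mean, 1))
print("mode:  ", mode)
```

With so few cases the single extreme score raises the mean well above the median, which is the distortion mentioned in the discussion of the mean.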

Dispersion 

In order to describe the status of a group one must know not only what is
typical, but also the extent to which the group varies from its own norm. One
company’s salesmen may be a very homogeneous group in so far as intelligence
test scores are concerned, most of them having I. Q.’s very close to the mean of
110; in another company the mean may be almost identical, let us say 111, but
the variability greater, some making much lower scores and others much higher.
Despite comparable means, the two groups are clearly different in intelligence.
Measures of dispersion are used to describe the extent to which the cases cluster
around the average or scatter away from it. These include the range, interquartile
range, and standard deviation, plus some others which are less widely
used.

Range. The range is the simplest and crudest measure of dispersion. It is
the difference between the lowest and the highest cases. Thus the ranges of
I. Q.’s in the groups of salesmen just mentioned may be from 105 to 110 in the
first group, and from 95 to 121 in the second. It is not a good measure of
variability, because a few extreme cases may give the appearance of considerable 
dispersion when most cases actually cluster close to the median. 

Interquartile Range. The interquartile range includes the middle fifty percent
of the group. By telling how far out from the average this half of the group
spreads, it gives a reasonably good idea of how representative the average is of
the group as a whole. The semi-interquartile range, the distance including 25
percent of the cases on one side of the median, is also sometimes used. These
measures are used with the median, and are part of the percentile system. Another
way of describing the interquartile range is to say that it extends from the
25th percentile to the 75th. It is computed by finding the test score which is
made by the person who is one-fourth of the way up from the bottom of the
distribution of scores, and that which is made by the person who is three-fourths
of the way up. These two scores or points on the distribution are called the
first and third quartiles (Q). The median is, of course, the second quartile, and
the high end of the range is the fourth. The term interquartile range is not
often used, but Q1 and Q3, the first and third quartiles, are generally used with
the median in order to describe the variability of the group.

Standard Deviation. The standard deviation (sigma or σ) is the measure of
dispersion commonly used to describe the variability of groups for which the
means have been ascertained. It is virtually an average of the distances of all
the scores in a distribution from their own average score. Means and standard
deviations are part of the moment system, just as medians and quartiles are
part of the percentile system. The standard deviation is, with the mean, the
more commonly used measure, because it lends itself most readily to use in other
formulas. The distance between one sigma either side of the mean of a normal
distribution includes, not 50 percent of the cases as in the interquartile range,
but the middle 68 percent. This number may seem awkward, but there is no
special virtue in 50 percent, and the standard deviation actually gives a somewhat
truer picture of the scattering of cases or scores around the mean.

One fundamental difference between the percentile and moment systems 
should be kept in mind: percentile, quartile, and other such scores are based
on the number of cases, but standard deviations or sigmas are based on distances
from the mean. The latter are therefore more truly measuring sticks in the usual 
sense of the term, and give a better idea of dispersion than do the former. 
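
The measures of dispersion just described can be compared on two groups whose means are the same. The following is only a sketch: the two sets of I. Q.’s are invented, and the quartiles are found with the `inclusive` method of Python's `statistics.quantiles`, which is one of several conventions in use and is an assumption of this example rather than a procedure prescribed here.

```python
import statistics

# Hypothetical I.Q.s for two groups of salesmen with identical means.
group_a = [105, 107, 109, 110, 110, 111, 113, 115]
group_b = [95, 100, 106, 110, 112, 116, 120, 121]

def describe(scores):
    """Return (range, Q1, Q3, standard deviation) for a list of scores."""
    score_range = max(scores) - min(scores)
    # quantiles(n=4) returns three cut points: Q1, the median, and Q3.
    q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    sigma = statistics.pstdev(scores)  # root-mean-square distance from the mean
    return score_range, q1, q3, sigma

range_a, q1_a, q3_a, sigma_a = describe(group_a)
range_b, q1_b, q3_b, sigma_b = describe(group_b)

print("A: mean", statistics.mean(group_a), "range", range_a,
      "Q1-Q3", (q1_a, q3_a), "sigma", round(sigma_a, 2))
print("B: mean", statistics.mean(group_b), "range", range_b,
      "Q1-Q3", (q1_b, q3_b), "sigma", round(sigma_b, 2))
```

All three measures agree that the second group is the more variable, although the means give no hint of it.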

Methods of Expressing Scores 

Test scores are commonly expressed as raw, percentile, or standard scores.
The raw score is simply the number of problems correctly solved, the number of
words known, or some other index of work done. It therefore needs no special 
defining. But raw scores are not meaningful until they are converted into some 
other type of score which shows the examinee’s standing with regard to a group 
of persons in the same occupation, school grade, or problem-group. Percentile 
scores and standard scores have their own special virtues and defects, which must 
be understood to be wisely used. 

Percentile Scores or Ranks. As was indicated above, percentile ranks are 
based on the frequency with which cases fall at given points on the scale. They
have the advantage of being based on a concept which is familiar to educators
and to people in general, who readily grasp the significance of the statement
that a college senior has more mechanical comprehension than 80 out of 100
applicants for engineering positions, or more ability to perceive differences in
pairs of numbers than only 10 out of 100 clerical workers. This is what a
percentile score tells one, when cases are counted from the bottom upward 
(the usual method). But just because it is based on counting cases this system 
has the defect of making differences near the median seem greater than they are, 
and of disguising differences at the extremes. To illustrate this latter defect, it 
is necessary only to point out that two people, one with an I. Q. of 130 and the 
other with an I. Q. of 180 are both at the 99th percentile; both are more intelligent
than 99 out of 100 persons, but the difference in their mental ability
is very great. If one used only the percentile score, as is done with most aptitude 
tests, the difference would be hidden. 
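
The defect of the percentile system can be seen in a small computation. The sketch below counts cases from the bottom upward, as described above; the function name `percentile_rank` and the norm group are hypothetical, and the norm group is spread evenly over its range purely for simplicity, which real I. Q.’s are not.

```python
def percentile_rank(score, norm_group):
    """Percentage of the norm group scoring below `score`, counted from the bottom."""
    below = sum(1 for s in norm_group if s < score)
    return 100.0 * below / len(norm_group)

# Hypothetical norm group of 100 cases: 99 spread evenly from 80 to 129,
# plus one very high case.
norm_group = [80 + i // 2 for i in range(99)] + [185]

for iq in (100, 105, 130, 180):
    print(iq, percentile_rank(iq, norm_group))
```

In this invented group a difference of 5 points near the middle moves a person 10 percentile points, while the 50 points separating 130 from 180 leave the percentile rank unchanged.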

Standard Scores. Standard scores, being based on distances from the mean, 
provide sensitive indices of abilities and traits. Most systems arbitrarily assign 
a standard score of 50 to the mean raw score, and make 10 standard score points 
the equivalent of one standard deviation in raw scores. Thus if the mean raw 
score in a normal distribution is 124, the mean standard score is arbitrarily 
called 50. If the standard deviation (the distance either side of the mean which 
includes 68 percent of the cases) of these raw scores is 40, then 124 (the mean 
raw score) plus 40 (the standard deviation in raw score points) equals 164, which 
is a standard score of 60. If two sigmas in raw score points were added to the 
mean raw score, one would have 124 plus 80, or 204; this would equal a standard 
score of 70. The mean raw score minus one sigma (124-40) equals 84, which is 
a standard score of 40. Minus two sigma would be a raw score of 44, or a 
standard score of 30. Most actual standard scores are between 30 and 70.

Some standard score systems differ slightly in method of expressing scores but
not in logic. T-scores are sometimes the same as those described here, but sometimes
consist of 100 units of .1 sigma each, ranging from -50 to 50. Sigma scores
are the same as standard scores except that they use only one digit to the left of
the decimal point, and therefore one or two to the right (mean equals 5.0). It is
possible also to designate the mean as zero, and to use true sigma scores, showing
scores as positive or negative (one sigma above the mean would be 1.0, one below
-1.0). Army standard scores, used in the Army General Classification Test, have
a mean of 100 and a standard deviation of 20, and range from 40 to 160, three
sigma below and above the mean.
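
The conversions worked out above follow one rule: express the raw score as a distance from the mean in sigma units, then re-express that distance on the chosen scale. The sketch below reproduces the figures of the worked example (mean raw score 124, standard deviation 40); the function name `standard_score` and its parameter names are illustrative only.

```python
def standard_score(raw, mean_raw, sd_raw, new_mean=50.0, new_sd=10.0):
    """Convert a raw score to a standard score with the chosen mean and sigma."""
    z = (raw - mean_raw) / sd_raw  # distance from the mean in sigma units
    return new_mean + new_sd * z

# Worked example from the text: mean raw score 124, standard deviation 40.
for raw in (44, 84, 124, 164, 204):
    print(raw, "->", standard_score(raw, 124, 40))

# The same raw score of 164 on a scale with mean 100 and sigma 20,
# the values given above for Army standard scores.
army = standard_score(164, 124, 40, new_mean=100, new_sd=20)
print("Army-type scale:", army)
```

Setting `new_mean=0` and `new_sd=1` gives the true sigma scores mentioned above, in which one sigma above the mean is 1.0.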

The Significance of a Difference 

When the test scores of two different groups are being compared, it is important
not only to have measures of central tendency and dispersion, but also
of the significance of whatever differences are found between these measures.
One must ask not only whether the mean score of one group is higher than that
of the other, but also whether it is sufficiently higher for one to have some confidence
that future samples of the same populations will differ from each other
in the same way. In asking these questions, one passes from descriptive statistics 
to the statistics of inference: instead of describing the status of a group, one 
generalizes from a known group to other similar but unobserved groups. An 
objective answer to this question is provided by measures of the significance of 
a difference, the most common of which is the critical ratio (the t-test). 

The Critical Ratio. The most common method of determining the likelihood
that the difference between two groups is reliable is to divide the difference between
the mean scores of the two groups by the standard error of the difference
between the means of the two groups. The resulting statistic is called the critical
ratio (C.R.). This can in turn be converted into an expression of probability (p),
that is, a statement concerning the number of times in 100 that the obtained
difference might be found strictly as a result of chance. This procedure is also
known as the t-test. When the number of cases exceeds 30, a critical ratio of
2.00 means that there are 5 chances in 100 that a difference such as that obtained
between the two groups would be found in a situation in which there were
really no differences. A critical ratio of 2.50 shows that there is only one chance
in 100 that the difference was due to chance factors. A critical ratio of 3.00
means virtual certainty that the observed difference is a real difference, for
chance could produce a difference of that order only three times in 1000.
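
The critical ratio and its conversion to a probability can be sketched as follows. Two assumptions are made which the text does not spell out: the standard error of the difference is taken in its usual form for two independent groups, and the probability is read from the normal curve, which is reasonable only when the number of cases is large. The function names and the salesmen figures are hypothetical.

```python
import math

def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two means divided by the standard error of the difference.

    Assumes two independent groups (an assumption of this sketch).
    """
    se_diff = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / se_diff

def chance_probability(cr):
    """Two-sided probability, under the normal curve, of a ratio at least this large."""
    return math.erfc(abs(cr) / math.sqrt(2))

# Hypothetical groups: means 110 and 104, sigma 12 in each, 72 cases in each.
cr = critical_ratio(110, 12, 72, 104, 12, 72)
print("C.R. =", cr, " p =", round(chance_probability(cr), 4))

# The three ratios discussed in the text.
for value in (2.00, 2.50, 3.00):
    print(value, round(chance_probability(value), 4))
```

The probabilities obtained in this way, roughly .05, .01, and .003, agree with the rounded figures of 5 in 100, 1 in 100, and 3 in 1000 given above.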

Decisions as to what actually constitutes a significant difference vary partly 
with the conservatism of the judge and, more legitimately, with the nature of the 
decisions to be made. A man out for a stroll might well avoid a bridge if there 
were 5 chances in 100 that it would collapse under his weight, for he could just 
as well walk elsewhere with less risk. But he might gladly cross it with only a
50-50 chance of its supporting his weight if the safety of his small child depended
upon his doing it.

Relationship 

In order to understand what a test measures, one must know what scores on 
that test are related to, and the degree of that relationship. If a test is supposed
to measure understanding of mechanical principles and processes, there should
be available evidence that it is related to success in mechanical activities. The
common measures of relationship are coefficients of correlation. There are a
variety of these, using some form of the letter r as symbol: product-moment or
zero-order, rank-order, biserial, tetrachoric, partial, and multiple correlation 
coefficients. There are other measures of relationship such as the coefficient of 
contingency (C) and the correlation ratio (eta), but as these are not commonly 
used in analyzing test data they are not discussed here. 

Product-Moment Correlation. When the results of two measures are expressed
in terms of scales they can be related to each other by the product-moment or
Pearson correlation coefficient (r). This is the common method of determining
such things as the extent to which intelligence and school marks vary with each
other and the degree of association between sales interest and success in selling
life insurance. These correlations can be plotted graphically, as in Figure 26.

The scale on the left-hand side of the graph shows intelligence test scores
(I. Q.’s): the higher the score on the scale, the more intelligent the individual.
The scale on the base line shows occupational level: the further to the right
one goes, the higher the person’s standing on the occupational ladder. An individual
who makes an intelligence quotient of 110 and an occupational level
score of 75 is represented by a stroke in the cell or box located at the intersection
of the perpendiculars (broken lines) leading from the points corresponding
to his scores on the side and base lines. Inspection of these strokes shows that,
the higher the location of an individual on the intelligence scale (left-hand side),
the higher (more to the right) his status is likely to be on the occupational scale
(base line). A correlation coefficient is nothing more than a quantitative and
therefore objective method of expressing the extent to which these strokes fall
in or deviate from a straight line drawn from the lower left-hand corner to the
upper right-hand corner of this graph. If all of the strokes fell on a straight line,
one could tell exactly what the occupational level of an individual is from a
knowledge of his intelligence test score. The coefficient of correlation would then
be 1.00, showing a perfect relationship. If the strokes scatter around this line, the
relationship is positive but less than perfect, as shown by the extent of scattering
away from the line and by correlation coefficients ranging from .20 or .30 to .90
or something less than 1.00. Sometimes the relationship is negative: then the
diagonal in Figure 26 runs from top-left to bottom-right, and coefficients of
-.20 or -.30 to something less than -1.00 would be preceded by minus signs
to show that the higher a person’s standing on one measure, the lower he is
likely to be on the other. This might be the case, for example, with intelligence
and number of bookkeeping errors.
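
The idea of strokes falling on or scattering around a straight line can be put in numerical form. The sketch below computes the product-moment coefficient directly from its definition; the function name `pearson_r` and all of the scores are invented for illustration and are not the data plotted in Figure 26.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from the two means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical cases: I.Q., occupational level score, and bookkeeping errors.
iq = [95, 100, 105, 110, 115, 120]
level = [40, 55, 50, 75, 70, 90]      # rises with I.Q., with some scatter
on_line = [2 * x - 150 for x in iq]   # every case exactly on a straight line
errors = [300 - 2 * x for x in iq]    # falls exactly as I.Q. rises

print(round(pearson_r(iq, level), 2))    # positive but less than 1.00
print(round(pearson_r(iq, on_line), 2))  # perfect positive relationship
print(round(pearson_r(iq, errors), 2))   # perfect negative relationship
```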

Figure 26. Relationship Between Two Traits

Each vertical line in the scatterdiagram represents one case. Thus
the single stroke at the junction of the two broken lines represents
one person with an I. Q. of 110 and an occupational level score of
75. The closer the strokes (cases) to the diagonal line (lower left to
upper right) the higher the correlation between the two traits; the
more they scatter away from it, the lower the correlation.

Correlations of 0.00 to .20 generally indicate a lack of relationship between
two measures, whatever the sign. But how large a correlation must be to be
significant depends upon the number of cases involved and upon the reliability
of the measures. For this reason the probable error of a correlation coefficient is
often appended by means of plus and minus signs (e.g., .24 ± .07). If the probable
error is as much as one-fourth the size of the correlation, the obtained relationship
may be due to chance factors. As the probable errors of correlation
coefficients tend to be as high as .05 or .08, the correlation generally has to be
above .20 or .30 to be statistically significant. Even then it may not be practically
significant for, as will be seen later, a correlation of .30 improves the efficiency
of a prediction by only 5 percent above chance. Some investigators (the statistical
purists) prefer to report the significance of a correlation in terms of the probability
of its occurrence by chance: in such cases it may be stated that, for a relationship
in a group of such a size to be significant at the one-percent level of
confidence it would have to be .35 (or some other figure). If the obtained correlation
is lower, it cannot then be said to be statistically significant or high
enough to be genuine. The level of confidence statement is one of probability:
the 5-percent level of confidence means, for example, that such a relationship
would occur by chance only 5 times in 100. As in the case of the significance of
a difference, the user of the test must then decide whether the degree of confidence
which he can have in the obtained relationship is great enough to serve
as a basis for making the kind of decision which is being considered. This depends
also upon the degree of confidence which he can have in alternative
bases, which is too often not ascertained. Assuming large enough numbers and
low enough probable errors, correlation coefficients are generally defined in the
following terms:

.80 and up: very high correlation
.50 to .80: substantial correlation
.30 to .50: some correlation
.20 to .30: slight correlation
.00 to .20: practically no correlation

A word of caution should be inserted here concerning the meaning of the
terms relationship and correlation. There is a very common and natural tendency
to translate these into cause and effect. But relationship means, statistically as
well as in everyday language, that things tend to be associated, to go together.
Cousins are related, but they are not one the cause of the other. It is true that
common sense tells us that intelligence causes good marks in school, and that
the two do not merely happen to go together or even to be the joint effects of
a common cause. But that is what common sense tells one, not correlation
statistics. All the correlation coefficient shows is that students who are
intelligent (or who get good marks) tend also to get good marks (or to be intelligent).

Rank-Order Correlation. This is another method of computing relationship,
simpler than the product-moment method and superior when only a few cases
are involved. It requires only that they be ranked in order of standing on the
tests or other measures. Logically the concept is the same as for the more accurate
method just discussed. In a perfect rank-order correlation (rho), for example,
the student or employee who stood first on one measure would also
stand first on the other, and so on down the line; if the relationship were negative,
the highest person on one would be the lowest on the other, the second
highest on one, second lowest on the other, etc.

Biserial Correlation. Sometimes one of the variables being analyzed cannot
be expressed in terms of scores or numbers on a scale. Thus the criterion of
success may be ability to learn to fly an airplane, keeping vs. losing a job, or
some other indication of ability or status which has only two categories. In that
case the biserial coefficient of correlation (r_bis) is used. It is interpreted in the
same manner as Pearson and rank-order correlations.

Tetrachoric Correlation. This type (r_tet) is used when both of the measures
are dichotomous. It is therefore rarely encountered in studies of tests.
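
Of the coefficients just described, the rank-order coefficient is simple enough to compute by hand, which is why it suits small groups. The sketch below uses the customary formula based on differences in rank, rho = 1 - 6Σd²/(n(n² - 1)); the formula is not given in the text and is supplied here as the conventional one. It assumes that no two persons are tied, and the scores and ratings are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (highest) downward; assumes there are no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def rank_order_rho(xs, ys):
    """Rank-order correlation from squared rank differences (no ties assumed)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: five employees' test scores and supervisors' ratings.
test_scores = [88, 75, 93, 60, 81]
ratings = [4, 3, 5, 1, 2]

rho = rank_order_rho(test_scores, ratings)
print(round(rho, 2))
```

If the two rankings were identical the coefficient would be 1.0, and if one were the exact reverse of the other it would be -1.0, as the description above requires.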

Partial Correlation. It is sometimes necessary to ascertain the relationship
between two measures, when the influence of a third variable is held constant
or ruled out. For example, if it is desired to find out the relationship between
perceptual speed and success in machine bookkeeping, and if intelligence affects
both scores on clerical perception tests and success in machine bookkeeping,
the correlation between clerical perception scores and bookkeeping success will
seem unduly high. The common third factor will make it so. Partial correlation
(r12.3) techniques make it possible to hold constant or eliminate the influence
of other measured factors, such as intelligence in this example, by statistical
methods. The interpretation of partial correlation coefficients is similar to that
of Pearson r’s.
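
The effect of holding a third variable constant can be shown with the customary first-order formula, in which the correlation between the first two variables is corrected for the correlation each has with the third. The formula is not stated in the text and is supplied here as the conventional one; the three coefficients in the example are invented.

```python
import math

def partial_r(r_12, r_13, r_23):
    """Correlation between variables 1 and 2 with variable 3 held constant."""
    return (r_12 - r_13 * r_23) / math.sqrt((1 - r_13 ** 2) * (1 - r_23 ** 2))

# Hypothetical coefficients for the bookkeeping example:
r_perception_success = 0.50       # clerical perception vs. bookkeeping success
r_perception_intelligence = 0.60  # clerical perception vs. intelligence
r_success_intelligence = 0.50     # bookkeeping success vs. intelligence

partial = partial_r(r_perception_success,
                    r_perception_intelligence,
                    r_success_intelligence)
print(round(partial, 2))
```

With these invented figures the apparent relationship of .50 shrinks to about .29 once intelligence is ruled out, which is the sense in which the uncorrected coefficient seemed unduly high.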

Multiple Correlation. When more than one test is used, as in selection
batteries, it is necessary to ascertain the relationship between scores on the combination
of tests and the criterion of success. This is done by multiple correlation
(R), in which product-moment correlations for each pair of variables are first
computed separately and then combined. The interpretation of R is similar
to that of r.
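
For the simplest case, two tests and one criterion, the combining of the separate coefficients can be sketched with the customary two-predictor formula. The formula is supplied here as the conventional one and is not quoted from the text; the coefficients are hypothetical, and larger batteries require matrix methods not shown.

```python
import math

def multiple_r(r_1c, r_2c, r_12):
    """Multiple correlation of two tests with a criterion.

    r_1c, r_2c: correlation of each test with the criterion.
    r_12: correlation between the two tests.
    """
    r_squared = (r_1c ** 2 + r_2c ** 2 - 2 * r_1c * r_2c * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

# Hypothetical validities of .40 and .30, with differing overlap between the tests.
for overlap in (0.0, 0.2, 0.8):
    print("r_12 =", overlap, " R =", round(multiple_r(0.40, 0.30, overlap), 3))
```

The less the two tests overlap with each other, the more the second adds to the first: with these figures R falls from .50 when the tests are unrelated to about .40 when they correlate .80.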

Reliability. One fundamental question which needs to be answered for 
every test has to do with the extent to which it agrees with itself. If it gives the
same results both times when used twice with the same person, or if a score based
on one-half of the test agrees with a score computed from the other half, it can be
used with confidence that it is measuring something and measuring it consistently.
This is known as reliability, and is expressed as a correlation coefficient.
If, on the other hand, the test does not agree with itself when repeated (retest
reliability) or divided into halves (split-half or odd-even reliability), one cannot 
even be sure that the test is measuring something. Of course some variation in 
scores is permissible, because of chance factors, practice effects, etc., but the 
reliability of a test for use with individuals should be .85 or above. It should 
be noted that the fact that a test is reliable (measures something consistently) 
does not prove that it is good for anything. This latter question is one, not of 
reliability, but of validity. 
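
An odd-even reliability coefficient can be sketched as follows. Each examinee's score on the odd-numbered items is correlated with his score on the even-numbered items. The final step, the Spearman-Brown correction for the halving of the test, is not described in the text; it is added here as the conventional adjustment and should be read as an assumption of this sketch. The item responses are invented.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical responses: one row per examinee, one column per item (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

odd_scores = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, 7
even_scores = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, 8

r_half = pearson_r(odd_scores, even_scores)
# Spearman-Brown step-up to full test length (an addition of this sketch).
r_full = 2 * r_half / (1 + r_half)

print(round(r_half, 2), round(r_full, 2))
```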

Validity. The second major application of correlation statistics to testing is
in the determination of the extent to which a test measures that which it purports
to measure or, in fact, anything else that the test user thinks it might be
desirable to measure. A test devised to measure a type of auditory acuity deemed
to be important in music was found also to be important in submarine detection
work (see p. 321), a test designed to measure aptitude for mechanical activities
was useful in selecting typists (page 273), and both of these were valid also for
the purposes for which they were designed. There are, on the other hand, many
published tests with names which imply that they are valid for some special purpose,
but with little in the way of evidence to prove the implication.

Evidence of the validity of a test is most often presented in the form of a 
correlation coefficient which indicates the degree of relationship between scores 
on the test and some external criterion such as grades, ratings of supervisors, 
earnings, output, or job satisfaction. When the use of common correlation techniques
is not possible (e.g., when the criterion is not scaled but consists only of
distinct categories or of two classes such as successes and failures), other less
refined measures of relationship are used. Some of these should actually not be
called measures, as they do not indicate the degree of relationship, but only the
fact that a relationship exists. They are more appropriately called tests of the
significance of a relationship. The chi-square test is one of these; their interpretation
need not be discussed here, as they are always finally expressed in
terms of the probability that a comparable relationship might be found on a 
chance basis. 

The phrases internal and external validity are frequently encountered in the 
literature, and the concepts are met even more often in test manuals. A test
author’s logic in selecting the content of his test has been clear-cut and stringent,
each item in the test has a high correlation with the total score (it has
internal consistency), and the test is reliable; these and other such facts are commonly
cited by authors of new tests as evidence of the validity of their instruments.
Such evidence is called internal evidence of validity, as it is based entirely
on analysis of the content of the test without reference to objective external 
criteria. Reliability indicates objectively that the test agrees with itself, but does 
not tell what it is that it measures. Internal consistency is another aspect of the 
same thing. And a critic’s evaluation of the appropriateness of the content of 
the test involves only subjective reference to external evidence, which is the 
same as that used by the test author in devising the items: author and critic 
could therefore be making the same judgmental errors. Therefore test manuals 
which cite only internal evidence of validity say, in effect, “Caveat emptor.” The
warning should always be made explicit, and publication of such a test should
carry with it responsibility for taking the next steps. 

External evidence of validity is, then, the only type which really provides an
adequate basis for judging a test, and tests lacking it are suitable only for experimental
use. Types of criteria against which a test can be validated are discussed
in Chapter 3 of this book. It is brought out there that finding appropriate
and usable criteria is by no means simple. As they consist of such things as
ratings, output, and other such variables they usually lend themselves to the use
of correlation techniques.

The minimum acceptable validity for a psychological test has generally been set at .45.
This figure is selected because a prediction on the basis of a test or battery of
tests with this degree of relationship to the criterion would be 11 percent better
than a prediction based upon chance; to put it another way, predictions based
on such data would be correct in about 55 out of 100 cases, wrong in about 45
out of 100 (that the figure 45 occurs twice in this context is not an indication
that correlations can be treated as percentages, but is due to other factors which
need not be gone into here). The setting of a minimum acceptable validity
coefficient, whether of .45 or some other figure, has had the unfortunate effect
of making many people conclude that a test with less validity for a given purpose
is therefore of no value. This involves a logical fallacy which should be
cleared up.
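
The text does not name the formula behind the statement that a coefficient of .45 predicts 11 percent better than chance. That figure is consistent with the index of forecasting efficiency, E = 1 - sqrt(1 - r²), and the sketch below assumes that this index is what is meant; the function name is hypothetical.

```python
import math

def forecasting_efficiency(r):
    """Percent improvement over chance: 100 * (1 - sqrt(1 - r^2)).

    Assumes the text's percentages are based on this index.
    """
    return 100 * (1 - math.sqrt(1 - r ** 2))

for r in (0.20, 0.30, 0.45, 0.70):
    print(r, round(forecasting_efficiency(r), 1))
```

On this assumption a coefficient of .20 yields about 2 percent, .30 about 5 percent, and .45 about 11 percent, so that the gain grows much more slowly than the coefficient itself until the coefficient is fairly high.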

It is true, as the data imply, that a relationship expressed by a validity coefficient
of less than .45 is so slight as to be of little practical value by itself. The
fallacy is the assumption that it is used by itself. In practice, test data are subjectively
combined with other data in estimating probabilities, whether in
counseling or in selection. These other data may consist of evidence of financial
backing which will make possible an educational or a business venture, judgments
of motivation and drive, amount and type of education received, etc. Each
of these also generally has relatively little relationship with success, but the
counselor or personnel manager trusts that by depending on a combination of
such considerations he will make better judgments than would otherwise be
made.

What psychology and statistics do is change trust to probability and convert
judgments to measures. A comprehensive test battery is a series of measures
of educational background, socio-economic level, intelligence, and whatever else
is related to success in the occupation in question. Each of them is known to be
related to an appropriate criterion of success as shown by a validity coefficient.
They are combined, not by the judgment of an individual, but by a regression
equation which gives to each variable the weight which experience has proved
it should have.

Experience with batteries of well-constructed and varied tests has shown that
measures with validity coefficients as low as .20 may be useful (provided the
correlation is statistically significant). It is true that, if such a test were used alone,
the predictions would be right only 51 times out of 100. But if this test measures
some trait or aptitude which is unrelated to other factors measured by a battery
of tests, it will add appreciably to the validity of the battery. An illustration of
this fact is discussed in connection with the development of a custom-built
personality inventory for pilots (see pp. 528 ff). In this investigation a test with a
validity coefficient of .20 and a low correlation with the battery raised the validity
of the battery from .66 to .70. This improvement in the correlation between tests
and criterion would have resulted in the selection procedure being right 65
percent instead of 63 percent of the time. The gain was relatively slight, but was
made at a cost of only 20 minutes of testing and less than one minute of scoring,
at a stage when finding any tests which improved the battery at all was extremely
difficult.
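
The gain from adding a weakly valid but independent test can be checked with the standard formula for the multiple correlation of a criterion with two predictors. The sketch below is illustrative only: the function name is hypothetical, and the intercorrelation of .00 between the new test and the existing battery is an assumption standing in for the "low correlation" mentioned above, not a figure taken from the investigation.

```python
import math

def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2: validity coefficients of the two predictors.
    r12: correlation between the two predictors (must not be +1 or -1).
    """
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return math.sqrt(r_squared)

# Battery validity .66 and new-test validity .20 come from the text;
# the intercorrelation of .00 is an assumed value for illustration.
combined = multiple_r(0.66, 0.20, 0.0)
print(round(combined, 2))
```

Under these assumptions the combined validity comes out at about .69, close to the reported rise to .70. If the new test instead correlated .30 with the battery, the same formula would leave the combined validity at about .66, which is why the low intercorrelation matters as much as the validity of the added test.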

This brings out a final point concerning validity coefficients: they are not
likely appreciably to exceed .70, because of the unreliability of criteria. The
logic of this should be clear if it is remembered that when two supervisors rate
the same employee their ratings do not agree perfectly, or that when two teachers
grade essay examinations the grades they give are by no means identical. If the
two sets of ratings are thought of as two measurements of the same thing (which
is what they are intended to be) then it is clear that the coefficient of correlation
which expresses the relationship between the two sets of ratings is a reliability
coefficient. The ratings of two supervisors infrequently reach an intercorrelation
of .70, falling short of the desired reliability of .85 or better. When the criterion
agrees so poorly with itself, one cannot expect even a test with a reliability of .95
to correlate very highly with the relatively unreliable criterion.

In summary, then, tests with validity coefficients of as little as .20 may be
useful; the combined predictors (some of which may not be tests) should have a
validity of . p, or better to be appreciably better than chance; and no
combination of tests is likely to yield a validity coefficient much above .70.

PREDICTION AND PROBABILITY

The use of the term “prediction” in the literature of vocational psychology has
been widespread. Thorndike’s early study (Ni»N) was entitled The Prediction of
Vocational Success, and a much more recent book sponsored by the Social Science
Research Council and edited by Horst (383) is entitled The Prediction of
Personal Adjustment. The articles in professional journals which use the term are
legion. The result is an impression that prediction is one of the major functions
of applied psychology.

Some Misgivings About Prediction 

But the term prediction needs to be defined, and the type of prediction
undertaken by vocational psychologists and counselors needs to be made clear. Kitson
(431) has forcefully expressed the misgivings of many psychologists concerning
the use of this term: “Once we recognize the influence of any or all of these
[personal and situational] factors on the vocational success of an individual, we
must acknowledge how futile and presumptuous it is to administer a few tests
to an individual and, from his scores, to attempt to foretell his eventual success
or failure. . . . Optimistic psychologists sometimes declare that we shall be able
to predict vocational success ‘when vocational tests are more highly developed.’
On this point, William James made a pertinent observation sixty years ago: ‘It
is safe to say that individual histories and biographies will never be written in
advance no matter how evolved psychology may become.’ ”

Allport has voiced similar misgivings (12): “The fact that 72% of the men
having the same antecedent record as John will make good is merely an actuarial
statement. It tells us nothing about John. If we knew John sufficiently well, we
might say not that he had a 72% chance of making good, but that he, as an
individual, was almost certain to succeed or else to fail.”

Multiplicity of Factors

Underlying the misgivings of writers such as Kitson and Allport is the
recognition of the fact that a person’s actions are determined by a great variety of
forces, some of them residing within the individual, some of them essentially
part of the environment. Horst (383- 13-1ft) has discussed these in some detail.

Personal factors may be either congenital or environmental in origin. As has
been well brought out by a number of investigations (e.g., 568), both constitution
and environment play some part and it is difficult to untangle their relative
importance. This fact is of significance to users of psychological tests in guidance
and selection, because it makes their task that much more complex: the
modifiability of a trait or aptitude makes it necessary not only to know the chances of
a person with a given amount of it succeeding in a given activity, but also the
chances of his having intervening experiences which modify the degree to which
he possesses the trait (not to mention the probability that having experience in
the activity itself will modify it).

Situational factors are often mensurable, but in many cases cannot be assessed.
The latter are often referred to as chance factors or luck. Among the situational
factors affecting success which can be measured and controlled are differences
in the purchasing power of sales territories, affecting the production of salesmen;
differences in the aspiration levels of cultural groups, affecting the output of
factory workers; and the possession of private pilot’s licenses by members of the
family of would-be aviation cadets, affecting their motivation to fly and their
orientation to flying. Some typical unpredictable situational factors which affect
success in vocational endeavors are: illness in the family, which makes the
individual geographically immobile and drains energy which might otherwise go
into his work; atmospheric conditions which make bombing difficult for
bombardiers in that locality; the colleagues with whom the individual must work,
such as a dishonest partner or a selfish collaborator; and the outbreak of war,
which handicaps persons in some occupations and materially aids those in others.

It has been pointed out by Horst (383:55) that “one of the chief reasons why
many prediction procedures have not attained a higher level of accuracy has been
their failure to take into account contingency factors.” Contingency factors are
those personal and situational factors which affect performance but for which the
probability of subsequent presence or absence is not known at the time of
prediction. Thus there is no way of knowing what the health of unborn children
will be, with its possible effects on the occupational mobility of the father; nor
is there any way of knowing when he is a sophomore in college the particular
type of sales job and territory a potential salesman will get. It is the failure of
most psychologists who write about “prediction” and publish studies of the
predictive value of tests to take such factors into account that has led Allport,
James, and Kitson to criticize the use of the term and to despair of actuarial
predictions in applied psychology.

Taking Contingency Factors Into Account 

But others have not been so pessimistic, as is brought out by the mere fact
of the publication of the Social Science Research Council's monograph (383).
Horst lists three methods of dealing with contingency or “chance” factors which
have proved promising:

1. Adjust the criterion score in terms of the contingency. Thus of two
salesmen with equal sales volume but in territories, one of which has a high level of
purchasing power and the other a low level, the salesman in the latter is to be
considered more successful, and his criterion score (sales volume) is corrected by
a statistically derived weight. This shows that he has really been more successful
than his mate in selling an equal amount under more difficult conditions.

2. Treat the contingency factor as one of the predictive elements. In the case
of the potential salesmen still in college this method would not work, for there
is no way of knowing what type of territory he will be given. Although suggested
by Horst, this technique is actually not applicable to true contingency factors,
for if by definition the probability of their presence cannot be known at the time
of prediction that item cannot be scored in making the prediction. Horst's
example (383:56) is drawn from the prediction of academic success: he states that
weights may be assigned for given amounts of time spent in outside work. But
this involves knowing how much time is spent in outside work by a given
individual. The so-called contingency factor then becomes a known variable and
a predictive factor. The procedure is comparable to scoring a would-be life
insurance salesman’s biographical data blank according to the known relationship
of age, marital status, and amount of insurance carried. Prediction studies in
which such ascertainable factors are not included among the predictors are
legitimately to be criticized: if prediction is to be attempted, all potentially relevant
and mensurable factors should be included among the predictors.

3. Predict the contingency factor. If the college student who is considering a
career in life insurance is to be tested and an estimate of his probable success in
that field is to be attempted, it is difficult to know how to weigh such items as
marital status and amount of insurance carried. At his present age he has not had
the opportunity to marry or to carry insurance which he will have had after he
has been out of college several years. But the probability of his getting married
and carrying insurance can be ascertained. Horst suggests that the prediction
formula might include factors related to subsequent marriage, and that, in
predicting the contingency (marriage), the prediction of the activity (selling) can
be improved.
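
The first of these methods, adjusting the criterion score for a known contingency, can be illustrated with a small sketch. Everything specific in it is hypothetical: the function name, the purchasing-power index (with 100 taken as an average territory), and the simple proportional correction, which merely stands in for the "statistically derived weight" that would in practice be obtained from actual sales data.

```python
def adjusted_sales(sales_volume, purchasing_power_index, average_index=100.0):
    """Scale raw sales to what they would be in an average territory.

    The proportional correction is an assumed stand-in for a weight
    derived statistically from real data.
    """
    if purchasing_power_index <= 0:
        raise ValueError("purchasing_power_index must be positive")
    return sales_volume * average_index / purchasing_power_index

# Two salesmen with equal raw volume in unequal territories (made-up figures).
rich_territory = adjusted_sales(50_000, purchasing_power_index=125)
poor_territory = adjusted_sales(50_000, purchasing_power_index=80)
print(rich_territory, poor_territory)
```

With these invented figures the salesman in the poorer territory receives the higher adjusted criterion score, which is the outcome the first method is meant to produce.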

The application of these methods adds considerably to the complexity of the
prediction procedure. In the case of military pilots in World War II, for
example, it meant that the prediction of success of fighter pilots in combat had to
be broken down into success as a
fighter-pilot-with-equipment-superior-to-that-of-the-enemy-in-a-theater-with-considerable-air-opposition, as a
fighter-pilot-with-equipment-superior-to-that-of-the-enemy-in-a-theater-with-little-air-opposition, as a
fighter-pilot-with-equipment-inferior-to-that-of-the-enemy-in-a-theater-with-considerable-air-opposition,
and other such categories. It meant that the test battery
had to include not only aptitude tests of the usual types, but also biographical
data blanks covering such factors as marital status and age (younger men are
more likely to succeed than older, but married men are better risks than single),
previous flying experience (those who flew voluntarily as civilians are most
interested), having a pilot relative (having a flier in the family seemed to mean having
flying “in the blood”), and urban vs. rural origin (city boys are less likely to
succeed in flying training than those who are more used to outdoor life).

Even when the criterion becomes as refined as possible, and even when the
list of predictors is made as inclusive as job analysis, man analysis, and ingenuity
of test construction permit, there are still many factors which are not covered.
These are the true contingency factors, those which are most truly matters of
chance, such as the honesty of a partner, the outbreak of war, and epidemics.


In view of the fact that no prediction of human behavior, vocational or
otherwise, can take into account all relevant factors, it seems wise to use the term
“prediction” cautiously and with a full awareness of its definition. As used by
statisticians the term “predict” is more or less synonymous with to “estimate,”
as in the prediction or estimation of weight from height or of a son’s intelligence
from his father’s. Knowing one, it is possible to make a “best estimate” of the
other which, while often not accurate, is much better than a pure guess. There
are times when one or more such correlates are known, when others are not
known, and when decisions or judgments need to be made. The best estimate is
then helpful.

But a best estimate is merely a statement of probability. It says, in effect, “there
are 7 chances in 10 that this man is not heavy enough to move this load,
intelligent enough to succeed in a highly selective college, or aggressive enough to
make good as a house-to-house salesman,” whichever the case may be. It should
be noted that these statements are not predictions, they are statements of the
probability of one specific type of behavior in one specific type of situation. It
is not success in factory work, in college, or in sales work which is predicted.
Rather, it is the probability of a person being heavy, bright, or aggressive enough
to perform a specified task which is estimated. The form in which the estimate
is expressed makes it clear that other factors which may affect success are not

Figure 27

DATA FOR ESTIMATING CHANCES OF SUCCESS IN PILOT TRAINING

(Bar chart by pilot stanine, 9 through 1 and total, showing the number of men
and the percent eliminated in pilot training.)

Actuarial data for this group of more than 50,000 aviation cadets
showed that cadets with stanines of 9 have 87 chances in 100 of
completing pilot training (primary through advanced), whereas those
with stanines of 5 have slightly less than a 50-50 chance of
completing training, and only about 19 in 100 of those with stanines of
1 succeed. After DuBois (214:145).

taken into account. The estimate of ability to move the load, obtain passing 
grades in college, or make sales can be made still better by taking other things 
into account; this may be done as Horst suggested, by including them among the 
predictors or in the criterion, or by subjective modification of the probability 
estimate. For example, one might take into account the previous physical 
activities of the laborer and the type of equipment used to move the load, the 
educational achievement of the college student’s mother and his own expressed 
attitudes toward college, and the financial need and past social achievements of
the salesman. 

Estimating the Probability of Success in Flying. Perhaps the meaning of
estimates of probability in vocational guidance and selection can best be made clear
by means of a specific example, that of the 50,000-odd cadet pilots processed by the AAF
aviation psychology program. Figure 27 shows the number of men in each stanine
group, and the percentage of failures in each group. The length of each bar is
proportionate to the percentage of men eliminated from flying training.
Approximately 81 percent of the men who were sent to flying training despite stanines of
1 (a practice discontinued early in the war) failed to make good. More than 70
percent of those with stanines of 2 and 3 also failed. But only about 13 percent
of those who made stanines of 9, and 24 percent of those with stanines of 8,
failed to make good. As shown also on page 2, for all stages of training this
battery of tests obviously had considerable value in differentiating the men who
were likely to succeed from those who were likely to fail in flying training. The
relationship in Figure 27 is expressed by a biserial correlation coefficient of .38
between pilot stanine and success in pilot training, which is raised to .49 when
corrected for restriction of range (it was .6*$ for an unselected experimental
group of over 1000 cadets [214:191]).

It is pertinent to point out that the battery of tests used included tests such
as those which constitute intelligence tests (arithmetic reasoning and reading
comprehension), tests of spatial visualization, general information, mechanical
comprehension, mathematical achievement, co-ordination, finger dexterity,
perceptual speed and reaction time, etc. It also included a biographical data blank
covering family background, education, occupational experience, hobbies,
urban-rural experience, etc. Although it did not include such measures of personality
as the Rorschach, interview impressions, and the like, such indices had been
tried and were found to have no validity for predicting success in flying. It was
therefore about as comprehensive a battery as could be devised. Such contingency
factors as were not provided for were probably not such as could be taken care
of without an undue additional expenditure of time and money. With this in
mind, let us consider the estimates of probability, the “predictions,” which may
be made on the basis of this combination of predictors.

Approximately 81 percent of the 67 pilot cadets who made stanines of 1, but
were sent to training nevertheless, failed to complete pilot training, having been
eliminated for flying deficiency or fear, or at their own request. The odds may
therefore be said to be four to one against a person with a stanine of 1 succeeding
in flying school. This is a statement of probability which is rather impressive
and, when other candidates are available, is certainly evidence in favor of not
selecting those with such scores. But suppose that one is concerned, not with the
selection of large numbers of men from a much larger pool, but rather with the
evaluation of the chances that a particular individual, John Smith, who made
a stanine of one, will make good as an Air Force flier. The odds are still four to
one against him: but there is, conversely, one chance in five that he will succeed.
These are not hopeless odds, and Smith will certainly argue, if given an
opportunity, that he is the one poor risk in five who will succeed! The personnel
worker, psychologist, or counselor cannot deny his contention. All he can do is
point out that each of the other poor risks feels the same way (they did when
the writer interviewed large numbers of them early in World War II), and that
experience shows that approximately four-fifths of them fail nonetheless. He
must recognize that the prediction is for a group: four-fifths of it will fail. For
any given member of the group all one has is a probability statement: the odds
are four to one against him. Only experience can show whether John Smith
would be one of the 81 failures or one of the 19 successes in every 100 men like
him.

The same can be said of the high stanine men. Of those who make stanines of
9 (8,076 men in this group of 50,597), only about 13 in each 100 fail in flying
training. The odds are therefore overwhelmingly in favor of the cadet who makes
a score of 9: they are about 7 to 1. But 13 in every 100 such cadets did fail, and
Cadet Jack Doe, who made a score of 9, has no way of knowing whether he is
one of the 87 or one of the 13. Neither has the personnel officer, the counselor,
or the psychologist who helped develop the tests.

The examples just cited are the most clear cut possible, for they are selected
from the extremes of the distribution. Consider the men who made average
scores, Cadet Jim Dale, for example, with a stanine of 5. In this sample of
50,000-odd who went to pilot training after taking the cadet tests, there were
some 8,000 who made stanines of 5. About 48 percent of these failed in flying
school. The odds are therefore about 50-50 that Dale will succeed, but there is
no way of knowing whether he will be one of the 52 in 100 who pass, or one of
the 48 who fail. He may consider it worth the risk, and so may others if there
is a manpower shortage. But when other goals are equally attractive the
candidate may well prefer greater odds in his favor, and when more promising
candidates are available personnel may legitimately reject him in favor of others.
In neither case is there any prediction that Dale will either succeed or fail: there
is only a statement of probability.
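
The conversion from a group failure rate to the odds quoted for an individual is simple arithmetic, sketched below. The function name is hypothetical; the failure percentages are those given in the text for stanines of 1, 5, and 9.

```python
def odds_for_and_against(percent_failing):
    """Return (odds against success, odds in favor of success) as ratios to 1."""
    if not 0 < percent_failing < 100:
        raise ValueError("percent_failing must be strictly between 0 and 100")
    percent_passing = 100 - percent_failing
    return percent_failing / percent_passing, percent_passing / percent_failing

# Failure rates quoted in the text for stanines 1, 5, and 9.
for stanine, failing in [(1, 81), (5, 48), (9, 13)]:
    against, in_favor = odds_for_and_against(failing)
    print(f"stanine {stanine}: {against:.1f} to 1 against, {in_favor:.1f} to 1 in favor")
```

The ratios come out at roughly 4 to 1 against for a stanine of 1, close to even for a stanine of 5, and roughly 7 to 1 in favor for a stanine of 9. In each case the figure describes the group, and says nothing about which member of it a given cadet will turn out to be.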

Confusion in Counseling. Unfortunately, probability statements are viewed by
many persons as predictions. The result is that, having heard a great deal about
the predictive value of tests used in selecting groups of men and women for
military or industrial assignments, many people come to vocational counselors
and psychologists to be tested in order to find out what they are “best fitted for.”
The success of vocational appraisal procedures for prediction in one sense (the
tendency of groups to succeed or fail) has created impossible expectations of
these same procedures when used for another purpose (the appraisal of
individuals). The result is a feeling of disappointment on the part of those seeking
guidance, and one of frustration at not being able to work effectively, in ways
appropriate to their tools, on the part of counselors. The situation would be
improved if the general public, and some psychologists whose absorption in test
construction has caused them to lose sight of the context in which they work,
would cease to think in terms of predicting the success or failure of individuals,
and come to think in terms of probabilities, some of the contingent factors of
which will remain unknown even after the most thorough of testing and
interviewing procedures.

The Accuracy of Estimates 

One final point needs to be made concerning the estimation of a person’s
standing on one scale, let us say production records, from his standing on
another scale, such as a battery of tests. The imperfect relationship between
test battery and criterion means that instead of yielding a point on the criterion
scale, the correlation coefficient yields a zone of approximation. Stated in
nonstatistical and concrete terms, when we estimate the amount of insurance which
an applicant for a job as insurance salesman will probably sell from the scores
which he makes on selection tests, the result must be expressed, not as
“$100,000.00,” but as “$100,000.00 plus or minus $30,000.00,” or as “from $70,000.00 to
$130,000.00.” The higher the validity coefficient, the narrower the zone of
approximation, that is, the closer the estimate is to a specific figure. Conversely, the
lower the validity coefficient, the wider the zone of approximation, or the greater
the range of sales which may be made by the potential salesman. When the
correlation between tests and sales is zero, the zone of approximation ranges
from nothing to infinity: what the salesman will sell is a matter of guesswork.

Table 35, commonly called a prediction table, makes it possible to ascertain
the most probable criterion score and the zone of approximation for any given
test score, given a validity coefficient, test scores which can be expressed in terms
of standard scores or percentiles, and criterion scores which can be expressed in
the same way. Test scores are normally available in this form, and many criterion
data can be converted into these terms. For example, if the criterion is dollar
value of insurance sold per annum, this figure can be ascertained for each
salesman, and the salesman’s standing can be compared to that of other salesmen of
the same product for the same company, and converted into a standard score
or percentile by the usual methods.

To use this table, one enters it with the score on the known measure (test) by
means of the score scale at the top. Following the appropriate column down, one
stops at the row opposite the r corresponding to the actual correlation between
predictor and criterion. The figure where column and row meet is the most 
probable criterion score. The column headed k (standard error of estimate) 
indicates the amount to be added to and subtracted (after multiplication by 10 
to match the standard scores) from the criterion score to give the zone of ap- 
Table 35.

ESTIMATION TABLE

To estimate a person’s most probable standing on a criterion score (expressed as a
standard score or percentile) from his standing on a test, knowing the correlation
between test and criterion.
68 in 100 that a person with a given test score will be placed on the criterion.
The column headed K gives the same data multiplied by 10. An example follows.


illustrated in Figure 28. 
Figure 28

THE ZONE OF APPROXIMATION 


Let us suppose that the correlation between the score on a test used in
selecting packers and number of boxes packed per hour is .50. Applicant Betty Jane
makes a standard score of 55 on the test (69th percentile) when compared with
the criterion group of applicants for such work. Locating 55 on the scale at the
top of Table 35, we follow it down the corresponding column to the row
opposite .50 in the column headed “r” (.50 because this is the validity coefficient for
this test when used for this purpose). The figure at which we stop is 52.5. The
standard error of estimate (k) corresponding to a validity coefficient of .50 is
shown in column two to be .87 sigma of the number of boxes packed. As
standard scores are used, the standard deviation of the scores is 10, and the standard
error of estimate in score units is 8.7. The most probable production score of
Betty Jane is therefore 52.5 (60th percentile), and her zone of approximation is
52.5 ± 8.7, or 43.8 to 61.2. This is shown graphically in Figure 28. This figure
brings out clearly the rough nature of the estimate provided by a validity
coefficient of .50; standard scores of 44 and 61 correspond to percentiles of 27 and
87. In other words, there are 68 chances in 100 that Betty Jane’s standing as a
packer, if she is employed as such, will be somewhere between the 27th and
87th percentiles when workers are ranked according to output. She may be a low
average, an average, or a superior worker, although she is most likely to be high
average. And as there are only 68 chances in 100 that she will be found that
effective, there are also 16 chances in 100 that she will be less effective than
that, placing somewhere in the bottom quarter of packers, and 16 chances in
100 that she will prove better even than the 87th percentile, placing near the
top of the group in number of boxes packed per hour.
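
The figures in this example can be reproduced without the table. The sketch below assumes standard scores with a mean of 50 and a standard deviation of 10, normally distributed scores for the percentile conversion, and the usual formulas for the regression estimate and the standard error of estimate; the function names are illustrative and do not come from the text.

```python
import math
from statistics import NormalDist

def estimate_criterion(test_score, r, mean=50.0, sd=10.0):
    """Most probable criterion standard score and its 68-in-100 zone."""
    predicted = mean + r * (test_score - mean)  # regression toward the mean
    k = math.sqrt(1 - r**2)                     # standard error of estimate, in sigma units
    margin = k * sd                             # the same, in standard-score units
    return predicted, predicted - margin, predicted + margin

def percentile(standard_score, mean=50.0, sd=10.0):
    """Percentile rank of a standard score, assuming a normal distribution."""
    return 100 * NormalDist(mean, sd).cdf(standard_score)

# Betty Jane: standard score of 55 on a test with a validity coefficient of .50.
predicted, low, high = estimate_criterion(55, 0.50)
print(round(predicted, 1), round(low, 1), round(high, 1))
print(round(percentile(low)), round(percentile(predicted)), round(percentile(high)))
```

Under these assumptions the estimate is 52.5 with a zone of about 43.8 to 61.2, and the corresponding percentiles are about 27, 60, and 87, in agreement with the values read from Table 35.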

To summarize these facts briefly, Betty Jane, who made a high average score
on a test which is as valid for its purpose as most tests now in use, may prove
after employment to be a low average, average, high average, or superior worker.
The odds are 2 to 1 that she will be in one of these categories. But she might
turn out to be either one of the least effective workers, or one of the best
producers in the plant: the odds are only 5 to 1 against either of these proving to be
the case. Such probabilities are useful, as they give one a definite basis for making
a decision, but they clearly provide only a “best estimate” of what Betty Jane
will do, not a prediction.




APPENDIX B


TEST PUBLISHERS AND SCORING
SERVICES REFERRED TO IN TEXT


American Council on Education
(see Educational Testing Service)

American Institute for Research
Cathedral of Learning
Pittsburgh 13, Pennsylvania

Association of American Medical Colleges
(see Educational Testing Service)

Bureau of Educational Research and Service
University of Iowa
Iowa City, Iowa

C. H. Stoelting and Company
424 North Homan Avenue
Chicago, Illinois

California Test Bureau
5916 Hollywood Boulevard
Los Angeles 28, California

Center for Psychological Service
George Washington University
2026 G Street, N.W.
Washington, D.C.

Cooperative Test Service
15 Amsterdam Avenue
New York 23, New York

Division of Applied Psychology
Purdue University
Lafayette, Indiana

Educational Records Bureau
437 West 59th Street
New York 23, New York

Educational Test Bureau
720 Washington Avenue, S.E.
Minneapolis, Minnesota

Educational Testing Service
Box 592
Princeton, New Jersey

Engineers Northwest
100 Metropolitan Life Bldg.
Minneapolis 1, Minn.

Grune and Stratton
381 Fourth Avenue
New York, New York

Harvard University Press
Cambridge, Massachusetts

Houghton Mifflin Company
2 Park St.
Boston 7, Mass.

McKnight and McKnight
Bloomington, Illinois

Marietta Apparatus Company
Marietta, Ohio

Psychological Corporation
522 Fifth Avenue
New York City, New York

Psychological Institute
Lake Alfred
Florida

Public School Publishing Company
Bloomington, Illinois

Science Research Associates
228 South Wabash Avenue
Chicago, Illinois

Sheridan Supply Company
Beverly Hills
California

Stanford University Press
Stanford University, California

United States Air Force (Aviation Psychology Program)
Washington, D.C.

United States Employment Service
Washington, D.C.

University of Iowa
Iowa City, Iowa

University of Minnesota Press
Minneapolis, Minnesota

West Publishing Company
St. Paul, Minnesota

Williams and Wilkins
Baltimore, Maryland

World Book Company
Yonkers-on-Hudson, New York